Re: [CODE4LIB] QR Code replacement for business card

2023-12-05 Thread Kyle Banerjee
> > Maybe one day these miscreants will phish our QR code business cards for > their benefit? > It's great that people are conscientious about putting safe info out there. However, the people scanning codes are the ones who need to worry. Even if everyone considers you 100% trustworthy, they still

Re: [CODE4LIB] tar without compression? General comments welcome

2023-09-21 Thread Kyle Banerjee
Tar works fine with or without compression. If space isn't an issue and you have a fast connection, it can be faster to simply transfer uncompressed than to go through the compression/decompression overhead. kyle On Thu, Sep 21, 2023 at 11:35 AM Esmé Cowles wrote: > That seems like a reasonable
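As a rough illustration of the point that tar works with or without compression, here is a minimal Python sketch using the standard-library tarfile module. The helper names and file names are made up for the example and do not come from the thread.

```python
import tarfile
from pathlib import Path


def make_archive(src_dir, dest, compress=False):
    """Bundle src_dir into a tar file, optionally gzip-compressed.

    make_archive is a hypothetical helper for this sketch.
    """
    mode = "w:gz" if compress else "w"  # plain "w" writes an uncompressed tar
    with tarfile.open(dest, mode) as tar:
        tar.add(src_dir, arcname=Path(src_dir).name)
    return Path(dest)


def list_members(archive):
    """Return sorted member names; "r:*" auto-detects any compression."""
    with tarfile.open(archive, "r:*") as tar:
        return sorted(tar.getnames())
```

Either archive unpacks to the same files. Which one transfers faster depends on the link speed and on how compressible the data is, so it is worth timing both on a sample.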

Re: [CODE4LIB] Graphics - modifying

2023-05-05 Thread Kyle Banerjee
If you mean you want to reduce the dimensions, the Microsoft Windows Powertoys are great for this -- you can do large batches at once. Imagemagick is much less friendly, but can convert, resize, change compression, and a bunch of other things. kyle On Fri, May 5, 2023 at 9:13 AM charles meyer w

Re: [CODE4LIB] Systems - to librarian or not to librarian?

2023-02-17 Thread Kyle Banerjee
Delivering good service is all about understanding pain and joy points -- which those without library experience won't have. My consistent observation both working in libraries and as a vendor rep with systems personnel at all kinds of institutions is that library experience is very important fo

Re: [CODE4LIB] "The rows are not all the same number of columns."

2023-01-16 Thread Kyle Banerjee
> loadablefile.csv On Mon, Jan 16, 2023 at 4:53 PM Kyle Banerjee wrote: > CSV is an unfortunate format because there's not a single accepted way to > do quoting and escaping. Excel is a problematic tool because it often does > things to your data that you don't want. > >

Re: [CODE4LIB] "The rows are not all the same number of columns."

2023-01-16 Thread Kyle Banerjee
CSV is an unfortunate format because there's not a single accepted way to do quoting and escaping. Excel is a problematic tool because it often does things to your data that you don't want. Counting commas won't work unless none of your data fields contain commas. However, if that happens to be a
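To illustrate why counting commas is unreliable, the Python sketch below counts fields with the standard csv module instead. It assumes the file uses double-quoted fields (the csv module's default dialect), which is exactly the kind of assumption the message warns about. The function name ragged_rows is invented for this example.

```python
import csv
import io
from collections import Counter


def ragged_rows(text):
    """Return (line_number, field_count) for rows whose width differs
    from the most common width in the data.

    Hypothetical helper; assumes double-quote quoting (csv's default dialect).
    """
    reader = csv.reader(io.StringIO(text))
    widths = [(reader.line_num, len(row)) for row in reader if row]
    if not widths:
        return []
    expected = Counter(n for _, n in widths).most_common(1)[0][0]
    return [(line, n) for line, n in widths if n != expected]
```

A row such as `1,"Smith, John",1999` contains three commas but only three fields, so a raw comma count would flag it wrongly while the parser does not.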

Re: [CODE4LIB] Digital Asset Management systems

2022-08-26 Thread Kyle Banerjee
Hi Elizabeth, If you're looking for the right DAM, I'd recommend getting some face time with different people to talk about their experiences and then see what sounds like it might be the best fit for your situation. This is a great bar discussion -- few say anything of substance if there's a chan

Re: [CODE4LIB] Certifications for library IT people?

2022-08-16 Thread Kyle Banerjee
rom such a course. On Tue, Aug 16, 2022 at 4:28 PM Kyle Banerjee wrote: > >> As far as certification, my humble opinion after 25+ years in academic >> (and now Library) IT work, is that certifications are only for hiring >> managers who want something to check off on an appli

Re: [CODE4LIB] Certifications for library IT people?

2022-08-16 Thread Kyle Banerjee
> > > As far as certification, my humble opinion after 25+ years in academic > (and now Library) IT work, is that certifications are only for hiring > managers who want something to check off on an applicant or pointy-haired > bosses who don't actually understand IT ...I would be suspicious

Re: [CODE4LIB] Tricking the Dragon

2022-06-28 Thread Kyle Banerjee
Tricking is probably not an option here. What Dragon interacts with (i.e. 64bit system and programs) is probably your limiting factor. Is there a reason Windows Speech Recognition wouldn't be an option? Kyle On Sat, Jun 25, 2022, 7:25 AM charles meyer wrote: > My esteemed listmates, > > > > I

Re: [CODE4LIB] Send control characters from an Android keyboard

2022-06-21 Thread Kyle Banerjee
pport every character and special key available on a physical keyboard: > https://github.com/Julow/Unexpected-Keyboard > > Andrew > > On Tue, Jun 21, 2022 at 7:27 PM Joe Hourclé wrote: > > > On Jun 21, 2022, at 2:44 PM, Kyle Banerjee > > wrote: > > > > >

[CODE4LIB] Send control characters from an Android keyboard

2022-06-21 Thread Kyle Banerjee
My actual objective is to use my phone to perform simple tasks on EC2 assets (ugly, but still very useful). Connections are only permitted via Session Manager or terminal within Node Actions of the AWS GUI -- no ssh. The AWS GUI terminal option works pretty well and makes connecting to bookmarked

Re: [CODE4LIB] Learning English Conversationally

2022-02-24 Thread Kyle Banerjee
I can't speak to learning English specifically, but anyone who's had to learn another tongue they didn't grow up with knows it's hard. My normal recommendation would be that reading and watching stuff that she already knows in English (English subtitles are OK, but no translations) can help develop

Re: [CODE4LIB] code4lib mailing list over the years

2022-01-14 Thread Kyle Banerjee
On Fri, Jan 14, 2022 at 2:45 PM Tim Spalding wrote: > "Mailing lists aren't what they used to be. Of the mailing lists in which I > subscribe, zero discussion happens. There are really only announcements. I > suppose the Code4Lib mailing list is no different." > > At the risk of starting a discus

Re: [CODE4LIB] ethics of screenscraping library opacs?

2021-11-29 Thread Kyle Banerjee
Regarding Z39.50, they may have SRU enabled which would make this really easy. Even if searches have to be passed through the UI, Primo interfaces allow you to retrieve a structured PNX record via GET query parameter. With the small number of searches, the harvest plan should work well. kyle On M
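For readers who haven't used SRU, a searchRetrieve request is an HTTP GET with a handful of standard parameters. The sketch below only builds the URL. The base address, the CQL index name and the helper name are placeholders. Whether a given catalog exposes SRU at all, and which version and indexes it supports, is something to confirm with the library.

```python
from urllib.parse import urlencode


def sru_url(base, cql, start=1, count=10, version="1.2"):
    """Build an SRU searchRetrieve URL.

    Hypothetical helper; base and the CQL index used are assumptions.
    """
    params = {
        "version": version,
        "operation": "searchRetrieve",
        "query": cql,               # CQL query string
        "startRecord": start,       # 1-based offset into the result set
        "maximumRecords": count,    # page size requested from the server
    }
    return base + "?" + urlencode(params)
```

Paging through a result set is then a matter of increasing startRecord by the page size, with a polite delay between requests.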

Re: [CODE4LIB] ethics of screenscraping library opacs?

2021-11-27 Thread Kyle Banerjee
A screen scrape harvest will likely generate more search activity than all actual searches combined while being really slow. Assuming a process where every call gets you a non duplicated record, a catalog of a relatively modest 2.5 million records would take you a month to retrieve at one record pe
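The snippet is cut off before the unit of the rate. If the rate is one record per second (an assumption made here, not something the archive shows), the "a month" estimate can be checked with a few lines of Python:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def harvest_days(records, seconds_per_record=1.0):
    """Days needed to fetch `records` items, one request at a time.

    seconds_per_record=1.0 is an assumed rate, not a figure from the thread.
    """
    return records * seconds_per_record / SECONDS_PER_DAY


print(f"{harvest_days(2_500_000):.1f} days")  # about 28.9 days
```

Any duplicate or failed requests only push that figure higher.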

Re: [CODE4LIB] using dublin core to express size measured in words

2021-10-19 Thread Kyle Banerjee
On Mon, Oct 18, 2021, 07:03 Eric Lease Morgan wrote: > > Using Dublin Core, is there a way to express size measured in words? > Tldr; using the word "words" is totally legit as is doing anything that makes sense to you. DC is a container standard like MARC or HTML -- you can think of it as a fo

Re: [CODE4LIB] Do we have any OAI programmers here? I have questions.

2021-09-13 Thread Kyle Banerjee
Hi Jill, I did a CONTENTdm to Omeka migration many years back. Wilhelmina pretty much covered what you need to be thinking about, but first thing to know is that OAI is rarely an appropriate migration tool. The second is that the whole point of migrating is that the new system works differently th

Re: [CODE4LIB] Library of Congress releases data

2021-09-08 Thread Kyle Banerjee
The data of interest is always a few years old. I assume this is because there are subscriptions for the new stuff. If you wanted to keep collecting the new stuff, I assume this would be possible via OAI or SRU with appropriate limiters. kyle On Wed, Sep 8, 2021 at 12:18 AM Michael Lackhoff wrot

Re: [CODE4LIB] Library of Congress releases data

2021-09-07 Thread Kyle Banerjee
The Cataloging Calculator has relied on free LC data from the beginning. What they provide now is so much easier to work with than it was in the beginning (literally scanning and parsing stuff off paper). Haven't used much in terms of tools to work with it, sed and vi for manipulation (Marcedit for

Re: [CODE4LIB] DDC is like an API specification so it can be used freely

2021-04-14 Thread Kyle Banerjee
A couple quick thoughts: - DDC isn't a structure in the same way that those other things are. DDC defines a conceptual universe (portions of which are regularly redefined) as well as ways of navigating it. In this sense, the navigation has more in common with menuing and the content ha

Re: [CODE4LIB] how our vocabulary is changing

2021-02-03 Thread Kyle Banerjee
Looks like a shift from operational to general concepts (except a few that point at other things like https and doi). With much greater complexity everywhere, one would hope discussion isn't dominated by super basic stuff. But with greater specialization, such terms may simply represent lowest com

Re: [CODE4LIB] is there a service to reconcile publishers' names?

2020-09-17 Thread Kyle Banerjee
Hi Sergio, As Debra mentions, it's a mess -- what you're basically asking for is an authority file for a field that's not authorized. The publisher field is transcribed from what's on the piece so there will be variations even before you consider splits, merges, acquisitions, name changes, etc wh

Re: [CODE4LIB] MS word doc table to Excel

2020-09-16 Thread Kyle Banerjee
Hi Amy, The easiest way I can think of is to just convert the table to text (click in the table and use the table layout tools to get to the function). That can either be saved as its own document which can be imported into a spreadsheet or pasted in. kyle On Wed, Sep 16, 2020 at 8:12 AM Amy Schul

Re: [CODE4LIB] kernel threads

2020-06-22 Thread Kyle Banerjee
Hi Eric, A few random thoughts -- basically all in agreement with what you've already heard. Agreed that subshells and I/O are what cost (I suspect disk IO in your case) and that playing with the number of parallel processes may help. Rewriting will only help if the dependencies and whatnot you'r

Re: [CODE4LIB] RegEx for public services/public-facing librarians

2020-01-16 Thread Kyle Banerjee
On Thu, Jan 16, 2020 at 6:43 AM Mike Monaco wrote: > Good morning, > A colleague and I are planning a workshop on using regular expressions and > expect an audience of primarily public services librarians. I was hoping > other users here could suggest some applications of regex that would be > us

[CODE4LIB] Universal viewer woes

2020-01-15 Thread Kyle Banerjee
Hi all, After a recent upgrade to our Samvera installation, we've been dogged with "Your log-in attempt did not appear to be successful. Please try again" errors in the IIIF viewing window. When the error occurs, it occurs with everything. But sometimes everything works. I haven't been able to fi

Re: [CODE4LIB] matching brief cataloging to OCLC records - scores

2020-01-07 Thread Kyle Banerjee
28 that matches this number (not the same > composer/title though). > > Thanks, > Cindy > > -Original Message----- > From: Code for Libraries On Behalf Of Kyle > Banerjee > Sent: Monday, January 06, 2020 5:55 PM > To: CODE4LIB@LISTS.CLIR.ORG > Subject: Re: [

Re: [CODE4LIB] matching brief cataloging to OCLC records - scores

2020-01-06 Thread Kyle Banerjee
Hi Cindy, Could you say a bit more about your project -- i.e. how many items you're dealing with, uniqueness of records you need to match against, reliability of the individual data points you're using for your key? As an abstract proposition, dirty matches are tricky. The basic approaches are to

Re: [CODE4LIB] Piecing together an offline search interface?

2019-11-11 Thread Kyle Banerjee
On Mon, Nov 11, 2019 at 10:06 AM Kyle Breneman wrote: > My university has a program that offers classes at a nearby prison, and > this program is about to get a bunch of new laptops. As many of you know, > prisons are pretty restrictive and inflexible regarding technology... > My gut reaction

Re: [CODE4LIB] Identifying description sources across a large corpus of MARC records

2019-09-23 Thread Kyle Banerjee
Hi Tim (and apologies to everyone for being so chatty on this topic) After your post, I've had your question in the back of my mind while working on other things -- the problem of identifying/improving low quality data is interesting given our growing reliance on publisher data. References to the

Re: [CODE4LIB] Identifying description sources across a large corpus of MARC records

2019-09-19 Thread Kyle Banerjee
The question of intentional addition of promotional, misleading, or low value information to catalog records is an interesting one. Makes me wonder how often DOIs or shortened URLs resolve to paywalls or even worse, affiliate links. On Thu, Sep 19, 2019, 11:30 McDonald, Stephen wrote: > I think

Re: [CODE4LIB] Identifying description sources across a large corpus of MARC records

2019-09-19 Thread Kyle Banerjee
I like the concept and think the signals you're already looking at (length, evaluative adjectives, etc.) are a solid way to go. If you haven't already looked at several thousand 520s to see what jumps out, I'd definitely do that. I would expect a combo of publisher and date of imprint (perhaps als

Re: [CODE4LIB] Keyword Extraction from Text

2019-09-16 Thread Kyle Banerjee
Hi Athina, The extractors are very different in terms of what they're optimized to work with and what they're designed to extract -- you need one designed for your purposes, and you may need more than one. A few years back, I experimented with a number of extractors before settling on Alchemy a

Re: [CODE4LIB] ORCID

2019-07-30 Thread Kyle Banerjee
On Tue, Jul 30, 2019 at 8:16 AM Bigwood, David wrote: > ORCID provides an API that can be queried by ORCID or institution name (I > think). Has anyone written code to query the API on a regular basis to pull > down articles by their faculty? Is it something you'd care to share? I've > not the fai

Re: [CODE4LIB] From the Community Support Squad wrt "Note [admiistratativia]"

2019-07-14 Thread Kyle Banerjee
On Sun, Jul 14, 2019 at 7:26 PM Stuart A. Yeates wrote: > I was personally ambivalent about anonymity on the mailing list. > > However, the fact that it appears to be predominantly men arguing for > banning anonymity and women arguing for allowing it is a tell that us > men folk might have our lo

Re: [CODE4LIB] From the Community Support Squad wrt "Note [admiistratativia]"

2019-07-14 Thread Kyle Banerjee
On Sun, Jul 14, 2019 at 4:21 PM Fitchett, Deborah < deborah.fitch...@lincoln.ac.nz> wrote: > ...If a dog has something useful to post to Code4Lib, why shouldn't they > be judged on the merits of their email? > Anonymity doesn't work in all environments, but it tends to work in small relatively st

Re: [CODE4LIB] From the Community Support Squad wrt "Note [admiistratativia]"

2019-07-11 Thread Kyle Banerjee
On Thu, Jul 11, 2019 at 1:55 PM Anne Slaughter < anne.slaugh...@railslibraries.info> wrote: > I hear you. Personally my web dev past is so ancient I'm officially not a > coder and have to work to get my head around GitHub every time I touch it. > A number of people have shared similar sentiments

Re: [CODE4LIB] recommendations for/advice re: Wordpress managed hosting

2019-07-03 Thread Kyle Banerjee
On Wed, Jul 3, 2019 at 11:50 AM Josh Welker wrote: > I agree with everything Kyle said except that it doesn't make you better at > other things. The sysadmin skills I learned in figuring out self-hosting > have transferred to many parts of my career inside and outside libraries > and have helped

Re: [CODE4LIB] recommendations for/advice re: Wordpress managed hosting

2019-07-03 Thread Kyle Banerjee
On Wed, Jul 3, 2019 at 5:46 AM Andrew L Hickner wrote: > Dear colleagues, > > We are exploring moving the Wordpress website for the local chapter of our > library association to managed hosting. I'd appreciate any advice and/or > provider reviews you are willing to share, and would be happy to su

Re: [CODE4LIB] [EXTERNAL] [CODE4LIB] Checking out and supporting equipment for patrons

2019-06-11 Thread Kyle Banerjee
rmation Technology Technical Associate > Brookens Library > University of Illinois Springfield > (217)206-7115 > Pronouns: He/Him<https://pronoun.is/he> > > > > -Original Message- > From: Code for Libraries [mailto:CODE4LIB@LISTS.CLIR.ORG] On Behalf

[CODE4LIB] Checking out and supporting equipment for patrons

2019-06-11 Thread Kyle Banerjee
If your library does this, how do you manage the myriad of batteries, remotes, cables, cards, mounts, instructions, etc. and what support do you provide for its use? We will make GoPro equipment available to patrons soon. But simply handing this stuff over to people who are unfamiliar with it soun

Re: [CODE4LIB] pattern libraries

2019-05-14 Thread Kyle Banerjee
On Tue, May 14, 2019 at 11:36 AM Birkin Diana wrote: > ... > Inspired by a University web redesign, a few of us in the Library are > beginning to investigate "pattern-libraries" to help us make and keep the > look & feel of our disparate systems more in-sync with one another > ... > > To thos

Re: [CODE4LIB] ArchivesSpace reCAPTCHA

2019-04-24 Thread Kyle Banerjee
Captchas are like frequently expiring passwords with ridiculous validation requirements and canned security questions people can't remember the answers to -- great examples of solutions built around administrative concerns at the expense of the users they're meant to serve. I cannot fathom how di

Re: [CODE4LIB] Looking for lightweight tool to identify PII

2019-04-22 Thread Kyle Banerjee
Hi Kim, Could you say a bit more about the documents, the scanning process, and how reliable the OCR is? I'd be leery of relying on OCR for identifying PII except as a secondary check (which may already be your plan). PII takes many forms which often require a trained eye to spot -- particular

Re: [CODE4LIB] Remote Work Policies

2019-04-18 Thread Kyle Banerjee
Hi Jenny, Every place I've worked except one over the past 20 years allows remote work. No two policies were the same -- not a surprise given that the missions, duties, and environments were different. What appears most effective in my experience is for the manager to determine what's appropriate

Re: [CODE4LIB] Online data transformation tools

2019-04-15 Thread Kyle Banerjee
On Mon, Apr 15, 2019 at 11:20 AM Thomas Dunbar wrote: > Hello everyone, > > I'm working on a proof of concept web application for common library data > conversions with support for large files. > The application is built using a serverless architecture, which allows me to > do this at scale and at l

Re: [CODE4LIB] nsf api and citations

2019-04-12 Thread Kyle Banerjee
On Fri, Apr 12, 2019 at 6:55 AM Kevin Hawkins < kevin.s.hawk...@ultraslavonic.info> wrote: > Publishers are increasingly storing such data in article > metadata as well, so in the long run, I expect this kind of searching on > publications will become easier. My confidence in this is tempered. H

Re: [CODE4LIB] Help with XML files?

2019-04-09 Thread Kyle Banerjee
On Mon, Apr 8, 2019 at 7:44 PM Mackenzie M. Salisbury < suzanne.salisb...@mail.waldenu.edu> wrote: > It might just take some minor updates to the Java to make the JAR file > work (I honestly don’t know if it’s just some tweaks or if the whole thing > need to be rewritten) Hi Mackenzie, Diagnos

Re: [CODE4LIB] Workflows/tools question for massive weeding project

2019-04-04 Thread Kyle Banerjee
Haven't helped with a major weeding project in a long time, but my gut reaction when I saw the desired workflow is that you might find it easier/faster to go lower tech. Weeding is a very physical process that is ergonomically difficult to integrate with equipment and apps requiring precise use of

Re: [CODE4LIB] named entities as metadata

2019-03-29 Thread Kyle Banerjee
I tried to populate metadata using entity extraction in a project some years ago and encountered the same issues as you. To cut straight to the chase, regex normalization routines based on eyeballing several thousand entries worked as well as anything. We were unable to come up with an effective w
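The message doesn't say which patterns were used, so the rules below are purely illustrative: a sketch of what regex normalization of extracted entity strings can look like. The rules and the function name are invented for this example.

```python
import re


def normalize_entity(raw):
    """Reduce an extracted name to a comparison key.

    Hypothetical rules for illustration; real ones come from eyeballing the data.
    """
    text = re.sub(r"\s+", " ", raw).strip()        # collapse runs of whitespace
    text = re.sub(r"[\s,.;:]+$", "", text)          # drop trailing punctuation
    text = re.sub(r"^the\s+", "", text, flags=re.IGNORECASE)  # drop a leading article
    return text.casefold()                          # case-insensitive key
```

Variants that reduce to the same key can then be grouped and reviewed by hand. Rules like the trailing-punctuation one will mangle some legitimate forms (abbreviations ending in a period, for instance), which is why reviewing real entries first matters.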

Re: [CODE4LIB] CONTENTdm Migration - words.txt files

2019-03-25 Thread Kyle Banerjee
Hi Sara, What are you migrating to? It's generally easiest to start with what your new system expects and then figure out how to get it that. There's a good chance that the most straightforward approach will be to perform OCR as part of the ingest/migration process. kyle On Mon, Mar 25, 2019 at

Re: [CODE4LIB] COinS

2019-03-19 Thread Kyle Banerjee
If you have to ask, you have your answer :) On Tue, Mar 19, 2019 at 8:00 AM Bigwood, David wrote: > Are COinS still of any value? Or should I clean-up my pages by getting rid > of them? I haven't heard them mentioned in years. > > Thanks, > David Bigwood > dbigw...@lpi.usra.edu

Re: [CODE4LIB] hathitrust api

2019-02-11 Thread Kyle Banerjee
Haven't used the Hathi API before. Is multithreading possible or do tech/policy constraints make that approach a nonstarter or otherwise not worth pursuing? Peeking at the documentation, I noticed a htd:numpages element. If that is usable, it would prevent the need to rely on errors to detect the doc

Re: [CODE4LIB] PDF Combine Pro Software Alternative

2019-02-04 Thread Kyle Banerjee
If command line tools are acceptable, one option is to use the convert utility to get everything into PDF and merge them with the pdfunite utility. Usage is simple and you can find examples on the interwebz. HTH kyle On Mon, Feb 4, 2019 at 12:36 PM Faust, Brad wrote: > Hello, > > We have a
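A sketch of how the two utilities might be chained from Python follows. It assumes ImageMagick's convert (newer releases name the command magick) and Poppler's pdfunite are installed. The file names and helper names are invented, and the original question is truncated, so the actual input formats are unknown.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(inputs, output):
    """Return argv lists: one convert call per non-PDF input, then one pdfunite call.

    Hypothetical helper; converted PDFs are written next to their sources.
    """
    commands, pdfs = [], []
    for name in inputs:
        path = Path(name)
        if path.suffix.lower() == ".pdf":
            pdfs.append(str(path))
        else:
            pdf = str(path.with_suffix(".pdf"))
            commands.append(["convert", str(path), pdf])
            pdfs.append(pdf)
    # pdfunite takes the inputs in order, with the output file last
    commands.append(["pdfunite", *pdfs, output])
    return commands


def run(commands):
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise RuntimeError(f"{argv[0]} is not installed")
        subprocess.run(argv, check=True)
```

Keeping command construction separate from execution makes it easy to print the commands for review before running a large batch.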

Re: [CODE4LIB] rdf and doi's

2019-01-15 Thread Kyle Banerjee
Hi Eric, I'd spin that around and ask how you'd exploit a set of CSV or files in any other format since regardless of how data is obtained, the structure and methods you use should be driven by the problem you want to solve. This is not to say that it's not sometimes fun to have a tool in your ha

Re: [CODE4LIB] Controlled vocab useful for describing skin tone

2019-01-03 Thread Kyle Banerjee
> > On Thu, Jan 3, 2019 at 4:18 PM Kirby, Jasmine S [LIB] > > wrote: > > > Maybe https://www.fentybeauty.com/shade-finder.html since it has > numbers. > > > > -Original Message- > > From: Code for Libraries On Behalf Of Kyle > > Banerjee >

[CODE4LIB] Controlled vocab useful for describing skin tone

2019-01-03 Thread Kyle Banerjee
Howdy all, I'm trying to wrap my mind around an image project that allows clinicians and students to see how different medical conditions visually manifest themselves on different skin tones and types. I suspect we'll use ICD-10 to describe conditions, but I'm curious as to what recommendations p

Re: [CODE4LIB] UUIDs in MARC 001

2018-12-21 Thread Kyle Banerjee
I would advise against using 001 for this purpose and using 035 instead. Although use of 001 varies, it's most often used as an internal matchpoint. This means using it all but guarantees that the field already has a competing use in the system -- making it a questionable idea to accept them. Be aw

Re: [CODE4LIB] Sharing Test Data Resources Around 1-2 GB

2018-12-12 Thread Kyle Banerjee
The S3 option seems reasonable, lends itself to the permissions sharing you need, and is easily automatable. This solution has the added benefit that developers can simplify transfer by working directly in AWS if they like. In the case at hand, 1-2 GB is downloadable. For stuff that's too big to d

Re: [CODE4LIB] ai in libraries

2018-12-11 Thread Kyle Banerjee
Helpful metadata places things in a conceptual universe. As such, the point is to impose bias whether humans or machines designed by humans make the determinations. Needs, understandings, etc evolve with time so these problems will endure. For example, history probably won't look kindly on a good

Re: [CODE4LIB] BIBFRAME, IFLA LRM, and PREMIS

2018-12-07 Thread Kyle Banerjee
Also not an expert on any of this stuff, but my understanding is that PREMIS is a dictionary that allows you to describe who owns something, whether the thing is legit, what's done to preserve it, what you need to use it, and rights management information. It's a flexible standard -- i.e. there ar

Re: [CODE4LIB] Recommendations for the New Kid

2018-10-17 Thread Kyle Banerjee
Hi Athina, As far as building your knowledge base goes, I've personally found it most useful to learn things as you need them because only those things that you actively use will stick. Then look for commonalities with other things you need and build on that. I don't recommend learning any partic

Re: [CODE4LIB] Are you a coder/programmer or a systems analyst or?

2018-09-27 Thread Kyle Banerjee
On Thu, Sep 27, 2018 at 6:51 AM Carol Kassel wrote: > .. I know I'd love to hear that > someone wants to make things better and not just build shiny new things! > Amen. Just as you need to watch out for seagull managers, you also have to be wary of seagull systems people. A lot of the most

Re: [CODE4LIB] Are you a coder/programmer or a systems analyst or?

2018-09-27 Thread Kyle Banerjee
On Wed, Sep 26, 2018 at 4:02 PM Salazar, Christina < christina.sala...@csuci.edu> wrote: > I think a part of why I'm asking is it seems sometimes (oftentimes?) the > folks who are doing the hiring or job postings don't really KNOW what all > is involved in many of the techie type librarian positio

Re: [CODE4LIB] indexing & searching chinese text using solr

2018-08-14 Thread Kyle Banerjee
Hi Eric, If you're pretty sure you indexed the characters properly and are getting garbage no matter what you do, my first thought is that this is a localization issue. Can you cat/grep/sed/vi/whatever these characters in a terminal window? If not, that is at least part of your problem. Running

Re: [CODE4LIB] Digital archive software package

2018-08-13 Thread Kyle Banerjee
We use MerlinOne and our foundation uses Extensis for what sounds like a similar purpose. We reviewed a number of other options in the past, but my knowledge is too stale to be useful. A few thoughts: 1) The most important thing is for the researcher to envision how she hopes she and others will inte

Re: [CODE4LIB] Web Server Specs?

2018-08-07 Thread Kyle Banerjee
It might be worthwhile having a conversation to get a feel for what they're equipped to support and what the cost structure will look like for the library. Sometimes, resources that are easy and cheap to provision on the open market can be difficult and (very) expensive to obtain locally. Also find

Re: [CODE4LIB] Trustworthy way to get server ID# from CONTENTdm

2018-07-03 Thread Kyle Banerjee
r. > > Here is a list of the parsers I've already created: > https://github.com/craigdietrich/tensor-profiles/tree/master/parsers > > Once a parser is created, any archive that uses that platform can be > accessed. > > I've attached a couple screengrabs, not sure

Re: [CODE4LIB] Trustworthy way to get server ID# from CONTENTdm

2018-07-03 Thread Kyle Banerjee
Hi Craig, Have you tried using nslookup? That should return the CNAME entry containing what you seek. Also, curiosity is killing me as to why knowing server numbers for machines you don't control would be useful. kyle On Tue, Jul 3, 2018 at 10:17 AM, Craig Dietrich wrote: > Hi all, > > This h

Re: [CODE4LIB] Best way to partially anonymize data?

2018-05-16 Thread Kyle Banerjee
On Tue, May 15, 2018 at 6:01 PM, John Pellman wrote: > Disclaimer: Some of this is probably going to be redundant with respect to > what Becky K has already said. > > Are you using DICOMs? If so, it's pretty straightforward to anonymize > Yes (or at least a lot of it is). I'm still waiting for

Re: [CODE4LIB] API for book descriptions?

2018-05-15 Thread Kyle Banerjee
Hi Christina, One possibility would just be to find a catalog likely to have what you need via SRU or Z39.50 -- the quality of metadata will probably be considerably better than what you can get off Amazon. I think Marcedit has a batch mode that will let you do this so you don't even need to do an

[CODE4LIB] Best way to partially anonymize data?

2018-05-11 Thread Kyle Banerjee
Howdy all, We need to share large datasets containing medical imagery without revealing PHI. The images themselves don't present a problem due to their nature but the embedded metadata does. What approaches might work ? Our first reaction was to encrypt problematic fields, embed a public key for

Re: [CODE4LIB] How to create an EAD file from delimited data, then import into Archivist Toolkit?

2018-05-06 Thread Kyle Banerjee
I'd give Marcedit a shot. It can convert your delimited file to MARC which you can then run through its MARC to EAD converter. kyle On Fri, May 4, 2018, 12:37 PM Yamil Suarez wrote: > Hello everyone, > > At our library we need to import an archival collection's EAD data file > into Archivist To

[CODE4LIB] What data manipulation/analysis problems do you need to solve?

2018-03-29 Thread Kyle Banerjee
Howdy all, I'm working on a book aimed at librarians newer to data manipulation/analysis who need help wrapping their minds around a few core concepts and tools already on their desktop computer (Mac, Windows, Linux) they need to solve real world problems. If there are data problems you'd like to

Re: [CODE4LIB] What DAMS does your institution use, and why?

2018-03-26 Thread Kyle Banerjee
On Mon, Mar 26, 2018 at 9:30 AM, Deirdre F Joyce wrote: > Thanks to Stephen for asking the question. We are starting to get ready to > go through a similar process, so I have appreciated everyone's insights and > resources as well. > My guess is a number of others are wondering the same thing. S

Re: [CODE4LIB] What DAMS does your institution use, and why?

2018-03-25 Thread Kyle Banerjee
Hi Steven, Even if you've been unable to identify a survey of the type you seek, many institutions trying to decide on a system build matrices to compare and contrast systems they consider viable so you might be able to obtain a few of those directly from institutions that have gone through that p

Re: [CODE4LIB] Using AWS to store TBs of digital video?

2018-03-02 Thread Kyle Banerjee
On Thu, Mar 1, 2018 at 2:24 PM, Tom Hutchinson wrote: > > It may make sense to take a hybrid approach. You could store locally > and put a copy in the dark cloud for safekeeping. If you are only > going to have one copy, a reputable cloud storage provider isn't a bad > choice. More than one copy

Re: [CODE4LIB] Using AWS to store TBs of digital video?

2018-03-01 Thread Kyle Banerjee
On Thu, Mar 1, 2018 at 11:11 AM, Kyle Breneman wrote: > 1. A combination of both. Around 60TB already digitized, with over 100TB > more footage still in analog form. > 2. Portable hard drives, a RAID, a Mac Pro tower. > 3. No, we do not have a digitization workflow defined yet. > 4. Define "q

[CODE4LIB] Job Posting: Repository Librarian at OHSU in Portland, OR

2018-02-01 Thread Kyle Banerjee
Oregon Health & Science University (OHSU) Library seeks a creative and service-oriented Repository Librarian. The successful candidate will develop innovative, customized library services in a dynamic environment. Working directly with OHSU faculty, staff, and students, the Repository Librarian de

Re: [CODE4LIB] BIBFRAME nesting question

2018-01-18 Thread Kyle Banerjee
On Thu, Jan 18, 2018 at 2:26 PM, Karen Coyle wrote: > But this gets really head-bangingly hard pretty quickly. Just to say > that we should not assume that FRBR actually works with real data - it > was never tested as such. Which raises the question of why we as a profession pay as much attenti

Re: [CODE4LIB] BIBFRAME nesting question

2018-01-18 Thread Kyle Banerjee
On Thu, Jan 18, 2018 at 11:49 AM, Karen Coyle wrote: > My gut feeling is that you should analyze your own data based on your > own use cases and then posit a model - so that your ideas are clear > before you step into the morass of BF assumptions... This. On Thu, Jan 18, 2018 at 11:53 AM, Jona

Re: [CODE4LIB] computing environments

2018-01-15 Thread Kyle Banerjee
On Mon, Jan 15, 2018 at 9:33 AM, Eric Lease Morgan wrote: > I’m curious to know how computing environment have changed in the past > couple of decades, and what sorts of environments are currently most > prevalent. —E This is sort of like asking about languages -- what you use is partly need dr

Re: [CODE4LIB] curating code4lib

2017-12-12 Thread Kyle Banerjee
On Tue, Dec 12, 2017 at 1:28 PM, Jonathan Rochkind wrote: > > Generally speaking, if you have to wonder about the value of something, > you > already have the answer ;) > > Kyle, I honestly am not sure which answer you are suggesting is the right > one in cases where you have to wonder! > Wh

Re: [CODE4LIB] curating code4lib

2017-12-12 Thread Kyle Banerjee
On Tue, Dec 12, 2017 at 9:10 AM, Eric Lease Morgan wrote: > As I sit here watching my EAD files get indexed by Solr, I ask myself, “To > what degree are we — the Code4Lib community — curating our content?” > > Seriously, our “community” generates content, and the bulk of it takes > three or four

Re: [CODE4LIB] Systems Librarian / software developer

2017-12-08 Thread Kyle Banerjee
On Fri, Dec 8, 2017 at 11:33 AM, Cowing, Jared wrote: > ... > I've noticed that the fields of Data Engineering & Data Architecture are > really coming into their own recently, following the big Data Science > explosion. I've even seen library positions using these terms (though not > for libraria

Re: [CODE4LIB] Systems Librarian / software developer

2017-12-07 Thread Kyle Banerjee
On Thu, Dec 7, 2017 at 12:24 PM, Sarah Weissman wrote: > I agree that a lot of the focus for students in computer science is > centered around big industry/big salary type jobs, but I think even in the > “work that matters” sector, the software developer pay scale outpaces the > librarian pay sca

Re: [CODE4LIB] Anyone web scraping to benefit their library?

2017-11-28 Thread Kyle Banerjee
Howdy Brad, Jason nailed it on the head. Scraping is what you're reduced to when APIs, extractions, DB calls, shipping drives, mounting data on shared infrastructure and the like aren't viable options. Also, scraping sometimes gets you precombined or preprocessed data that would otherwise be a pa

Re: [CODE4LIB] Comparing Barcodes Between 2 Files?

2017-11-22 Thread Kyle Banerjee
Howdy Renate, I had sent it directly. cat file1 file2 | sort | uniq -d > duplicate_barcodes For Windows users for whom an emulation layer such as VirtualBox or Cygwin is overkill and who encounter issues implementing Gnuwin, there are two options: 1) Windows 10: Enable Linux subsystem (it's a Win
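A minimal sketch of the one-liner in the message above, run against made-up sample data. The barcode values and the temporary directory are assumptions for illustration only; the file names file1, file2, and duplicate_barcodes come from the message.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: one barcode per line in each file.
printf '%s\n' 31234000000011 31234000000029 31234000000037 > file1
printf '%s\n' 31234000000029 31234000000045 31234000000011 > file2

# sort puts identical lines next to each other;
# uniq -d then prints one copy of every line that occurs more than once.
cat file1 file2 | sort | uniq -d > duplicate_barcodes

cat duplicate_barcodes
```

One caveat: a barcode listed twice within a single file is also reported, even if it never appears in the other file. If that matters, deduplicating each file first (for example with sort -u) before concatenating should avoid it.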

Re: [CODE4LIB] direct descendants of a given element

2017-11-17 Thread Kyle Banerjee
On Fri, Nov 17, 2017 at 2:29 AM, Ellerbeck, Carol wrote: > On the other hand, is it really a case of two affiliations, or just one? > The University of Maryland University College (UMUC) is an American public > not-for-profit university located in Adelphi in Prince George's County, > Maryland >

Re: [CODE4LIB] direct descendants of a given element

2017-11-16 Thread Kyle Banerjee
Howdy Eric, Are you using XSLT because you need to, or is it because your source is in XML? Also, do you have any environmental constraints? The reason I ask is that converting that specific XML to the tab delimited output you specify looks like it would be much easier via conversion to JSON and

Re: [CODE4LIB] Suggestion for an in-office sharable digital image asset system?

2017-11-14 Thread Kyle Banerjee
I agree that it really depends on what is needed. For example, how many images are there, what kind of capabilities do you need (e.g. metadata fields, searching, batch processing, workflows)? If your needs are simple enough, there's no particular reason why you couldn't just use the regular filesy

Re: [CODE4LIB] Cloud options

2017-10-31 Thread Kyle Banerjee
Howdy Virgil, What you need your solution to do -- i.e. what do you/will you have, what do people need to be able to do, what kind of integrations you need, what kind of local expertise you have, and long term objectives -- should drive your decision. Could you say more about your needs? kyle On

Re: [CODE4LIB] Help with parsing dates?

2017-10-30 Thread Kyle Banerjee
Hi Julie, I think this will be easiest if you break your problem into chunks and process it using multiple steps. Have you used the stream editor (sed) before? Sed would be very useful for converting many of the variations you list into a format understood by Excel/OpenRefine, normalizing the var
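A rough sketch of the multi-step sed approach suggested above. The date variations shown here are hypothetical, since the original list is not reproduced in this excerpt, and the sketch assumes slash-separated dates are month/day/year. Each -e expression handles one variation, and anything unrecognized passes through unchanged for manual review.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a few inconsistent date forms, one per line.
printf '%s\n' '12/25/2017' '1/5/2017' '2017.03.09' 'c. 1999' > dates.txt

# Step 1: M/D/YYYY   -> YYYY-M-D   (assumes US month/day order)
# Step 2: YYYY.MM.DD -> YYYY-MM-DD
# Step 3: zero-pad a single-digit month
# Step 4: zero-pad a single-digit day
sed -E \
  -e 's#^([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})$#\3-\1-\2#' \
  -e 's#^([0-9]{4})\.([0-9]{2})\.([0-9]{2})$#\1-\2-\3#' \
  -e 's#^([0-9]{4})-([0-9])-#\1-0\2-#' \
  -e 's#^([0-9]{4}-[0-9]{2})-([0-9])$#\1-0\2#' \
  dates.txt > dates_normalized.txt

cat dates_normalized.txt
```

Keeping each variation in its own expression makes it easy to add or reorder rules as new forms turn up in the data.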

Re: [CODE4LIB] Lightweight IR infrastructure

2017-10-27 Thread Kyle Banerjee
On Fri, Oct 27, 2017 at 1:53 PM, Josh Welker wrote: > How do you handle versioning for metadata? Is that something Samvera does? > My inclination would be to store the metadata in some sort of plaintext > file (rdf/json or whatever) and then just throw all the files into a Git > repository. > Th

Re: [CODE4LIB] Lightweight IR infrastructure

2017-10-26 Thread Kyle Banerjee
On Thu, Oct 26, 2017 at 2:48 PM, Josh Welker wrote: > Kyle, are all your Glacier/S3 assets backed up by a person, or is it > automated as part of an IR software package of some sort? > Both. This past year, we started using a semi-automated process for objects not in the repository -- there's a

Re: [CODE4LIB] Lightweight IR infrastructure

2017-10-26 Thread Kyle Banerjee
On Thu, Oct 26, 2017 at 7:03 AM, Jonathan Rochkind wrote: > I think it's actually worth interrogating and getting specific about what > we mean by "preservation features". > > I think they may not actually be all that complicated or hard to add on to > nearly any solution. I think an actual 'rep

Re: [CODE4LIB] clustering techniques for normalizing bibliographic data

2017-10-25 Thread Kyle Banerjee
On Wed, Oct 25, 2017 at 8:57 AM, Eric Lease Morgan wrote: > ...My bibliographic data is fraught with inconsistencies. For example, a > publisher’s name may be recorded one way, another way, or a third way. The > same goes for things like publisher place: South Bend; South Bend, IN; > South Bend,

Re: [CODE4LIB] Fiscal continuity vote now open [radical idea]

2017-10-24 Thread Kyle Banerjee
I would be leery of interpreting abstention in that way. Similar logic has been employed in some states to prevent referendums involving tax increases from being passed. My sense is that the low vote total reflects that people understand this is a serious issue requiring an informed decision. Those who

Re: [CODE4LIB] Persistent Identifiers for organizations/institutions.

2017-10-14 Thread Kyle Banerjee
On Sat, Oct 14, 2017 at 9:12 AM, Edward Summers wrote: > > > On Oct 14, 2017, at 7:09 AM, Stuart A. Yeates wrote: > > > > archive.org web harvests include at least some DNS details for the > content > > they harvest. I'm not sure how comprehensive it is and I'm pretty sure > that > > there isn't
