Slightly-linked thoughts on the Semantic Web

I’d like to take a moment of your time and some of your brain’s real estate.  While I may link to some articles, I will not pretend that I am providing much information on them.  Think of this as a commentary from someone who has spent a few decades hunting and pecking at the edges of the semantic web.

Again, I am well aware that the LS566 class’ thoughts on the semantic web probably SHOULD be limited to, “Can this information get me a job?” and “How much of my grade is based on this?”  But accept this little bit of trivia/anecdote/Zen koan, and next time someone asks why you’re in library school, you can say, “Well, I like books, but I’m also contributing to breakthroughs in artificial intelligence.”  I guarantee, within the next couple of your sentences, they’ll find someone else to talk to, which is why I say most things in conversations.

First story: In 1995, when I was in computer science graduate school at Cornell (before dropping out), I took a class in data searching.  One of the systems we worked with could do an amazing thing:  Given a block of text, like an encyclopedia, it could extract topic sentences for a section, find related articles (for example, when asked about “detective novels,” it could tell you that “Edgar Allan Poe” was the most-related article), summarize, all sorts of good things.  But it was not an artificial intelligence program; it was a word counter, mostly.  This was back when AI programs were still struggling to recognize a coffee cup sitting on a counter (as a yes/no question).  The professor’s point was that, if something so stupid could do a task that people find intelligent, then artificial intelligence had some real hurdles.  I posit instead that it’s a lot easier to create behavior of use to intelligent beings when the input has the intelligence built in.  If it were random words, the system wouldn’t have done very well.  The more intelligence put into the source material, the better the behavior.
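
For the curious, a "word counter" of this kind can be sketched in a few lines of Python. This is not the system from that class, only a minimal illustration of the idea: count the words in the query and in each article, then pick the article whose counts overlap most (cosine similarity). The function names and the two tiny sample articles are made up for the example.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text and count each word (the whole 'intelligence' of the system)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count tables; 0.0 when nothing overlaps."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_related(query, articles):
    """Return the title of the article whose word counts best match the query."""
    q = word_counts(query)
    return max(articles, key=lambda title: cosine(q, word_counts(articles[title])))

# Hypothetical sample data, standing in for an encyclopedia.
ARTICLES = {
    "Edgar Allan Poe": "Poe wrote early detective stories and influenced later detective novels.",
    "Coffee": "Coffee is a brewed drink made from roasted beans.",
}

print(most_related("detective novels", ARTICLES))
```

Nothing here understands detectives or novels. The match works only because a person wrote the article with the relevant words in it, which is the point about intelligence being built into the input.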

Linguistics trivia:  There are three levels of relationships in language.

  1. Subject related to nothing else: I sit.  Joey plays.
  2. Subject related to an object:  I see Susan.  Joey hits the ball.  This book’s subject is Hollywood celebrities.
  3. Subject related to the relationship between two objects (anything more complicated can be broken down to a combination of these three):  I see that Susan is getting on the bus.  I am searching for “Who is Jamie Lee Curtis’ mother?”

I really do think that the semantic web is working to help machines handle, not just data about objects, but data on relationships between networks of objects, something beyond the “this has something to do with this” nature of web links.
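
To make the second and third levels a bit more concrete, here is a rough sketch in Python of relationships stored as subject/predicate/object triples, which is the general shape linked data takes. The `Triple` class, the predicate names (`hits`, `hasMother`, `see`, `isGettingOn`), and the `objects_of` helper are all invented for this illustration and are not taken from any real vocabulary or library.

```python
from typing import NamedTuple, Union

class Triple(NamedTuple):
    """One statement: a subject, a relationship, and an object.
    The object may itself be a Triple (a statement about a statement)."""
    subject: str
    predicate: str
    obj: Union[str, "Triple"]

# Hypothetical facts, using made-up predicate names.
FACTS = [
    # Level 2: subject related to an object.
    Triple("Joey", "hits", "the ball"),
    Triple("Jamie Lee Curtis", "hasMother", "Janet Leigh"),
    # Level 3: subject related to a relationship between two objects.
    Triple("I", "see", Triple("Susan", "isGettingOn", "the bus")),
]

def objects_of(subject, predicate, facts):
    """Answer 'what is <subject>'s <predicate>?' by scanning the facts."""
    return [t.obj for t in facts if t.subject == subject and t.predicate == predicate]

print(objects_of("Jamie Lee Curtis", "hasMother", FACTS))
```

A plain web link could only say that the Jamie Lee Curtis page has something to do with the Janet Leigh page. The named predicate is what lets a machine answer the "mother" question directly.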

Child psychology trivia:  When children learn language, they spend some time asking, “What’s that?” but then they put most of their energy into naming things and paying attention when someone corrects them.

General thought on getting support for ideas:  The one nice thing about building intelligence by organizing complex things into simpler concepts, instead of starting with simple object recognition, is that a system that returns a lot of guesses on relevance from a large pool of data, and is only half right, is much more useful than a system that does something simple and is only half right.

Remember Svenonius?  Yes, a lot of the book was on the structure of surrogate record vocabularies, but it was all in the service of discovering which portions of it could be best automated.

In short, while the semantic web and linked data provide many benefits in the here and now, and getting library information to the public is important, and I’m sure that Sir Tim Berners-Lee really wants Congress to know about public info monitors and DB interoperability, one of the long-term goals of this idea must be to test out various micro-representations of physical objects and events in order to someday get machines, based on the examples we’ve given (“In this event announcement, we want to know this, this, and this, and here’s where they are”), to extract their own linked data tags and connect them to the correct URIs.  While this is a vital project, given all of the new data being created all the time, libraries and librarians are needed to judge these evolving data structures, pay for standardization, develop new structures as situations warrant, make sure that the fruits of this work are available to all, and help make sure that protections of information remain in place when needed.

3 thoughts on “Slightly-linked thoughts on the Semantic Web”

  1. beccabillings3 says:

    I like the child trivia. My daughter has begun to do just that. Interesting stuff! I do like the professor explaining that a.i. has hurdles by being easily impressed by common things. And it’s true, “the more intelligence WE put into a search, the better we get back.”


