UBIO or is it TNS?

Tue Jul 9 13:50:20 CDT 2002

Dr. Thompson - Thank you for the comments and criticisms.  I absolutely agree with all of your points regarding the difficulties of this or any comprehensive endeavor in this area.  It's a big, messy problem.

I would like to address a few points that reflect our approach to this problem.  When we started out, we were well aware of all the larger endeavors that haven't provided us with a workable solution.

1. Scalability - We decided right away that we do not want a system that can only succeed when everyone jumps aboard and endorses it.  I can't risk that.  The infrastructure has to give value at small scales and then be able to grow to accommodate any increase in usage.  We can't afford to have to convince you or anyone else that this is a good idea in order to progress.  UbIO will not be a failure, because it doesn't need to meet everyone's needs or have all names in order to succeed.  I already know how surly this discipline is.  Our strategy is to think big but start small.  Our pre-funding TNS already works for us.  It gets better and more functional as we add functionality to the core API.  Once we have the functionality we have already designed and are implementing in our current system, we will have something we and others can use right away.  We can do quite a bit with very little.  Other libraries already want it and, in fact, almost anyone with our sort of data issues does.  We will make something that improves data access to, and management of, our collections.  We really don't have a choice.  Taxonomic informatics might be a mess within the discipline, but it's a bigger mess outside it, where we have all the same issues and no means to organize them.  UbIO is simply a mechanism to try to get those who need this taxonomic metadata closer to those who have access to it.  If those who have it don't wish to make it available, then we will use what we can access.  It's really not just an issue about names.

2.  The development of TNS is premised on all of the informatic naivete you indicate, Dr. Thompson.  We do love your classifications, of course, but we don't think valid names should be unique keys, and we really have made an attempt to understand the problems that you know and ignore.  Our view is that these are tractable problems and will be solved, and I haven't seen anything yet that leads me to believe otherwise.

TNS has at its core the means to qualify and record any name and any classification.  Period.  Hierarchical or not.  That's the data model and it's a work in progress.  What comes out is only as good as what goes in.  When we design a data model that we want to work, we make sure it accommodates reality.  Homonymy, synonymy, multiple classifications, different species concepts, cladistic models, etc. are all part of reality and are issues we address.  Our viewpoint is exterior to systematics, and while we do look at it from an informatics perspective, we also come from the biological perspective.  We have a taxonomist on our team, and others with advisory roles.  Any other input is welcome.

Our view is that systematics and taxonomy are moving targets, but they are based on reasonable and logical principles.  Revisions don't appear a priori, or at least not without some reasonable and tractable basis.  Names refer to something, and those things can be qualified and related.  TNS is concept-based, founded on the data model of the Unified Medical Language System (UMLS).  As messy as systematics appears, medical terminology is as messy or, arguably, even more so.  UMLS began when it was realized there really wasn't a choice but to create it.  We find ourselves in similar straits.  We need tools, we need an infrastructure, and the current suite of what is available hasn't worked for us.  A huge collection of names is only the beginning of a system.  Effective and flexible delivery is at the heart of it.  I believe this infrastructure can meet the needs of taxonomists and external uses, and it doesn't require buy-in from everyone to get started.  As we progress I hope we can demonstrate this.
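To make "concept-based" a little more concrete, here is a minimal sketch in Python.  It is not the TNS schema or API; the class and method names are made up for illustration.  The only point is that name strings and concepts each get their own arbitrary IDs, so one concept can carry several names (synonymy) and one name string can attach to several concepts (homonymy).

```python
from collections import defaultdict


class NameStore:
    """Toy concept-based store (hypothetical, not the real TNS model)."""

    def __init__(self):
        self.name_ids = {}                    # name string -> arbitrary name ID
        self.concept_labels = {}              # concept ID -> descriptive label
        self.names_of = defaultdict(set)      # concept ID -> name IDs
        self.concepts_of = defaultdict(set)   # name ID -> concept IDs

    def add_name(self, text):
        # Reuse the existing ID if this exact string was recorded before.
        if text not in self.name_ids:
            self.name_ids[text] = len(self.name_ids) + 1
        return self.name_ids[text]

    def add_concept(self, label, names):
        # A concept is identified by its own ID, never by a "valid name".
        concept_id = len(self.concept_labels) + 1
        self.concept_labels[concept_id] = label
        for text in names:
            name_id = self.add_name(text)
            self.names_of[concept_id].add(name_id)
            self.concepts_of[name_id].add(concept_id)
        return concept_id

    def concepts_for(self, text):
        """Labels of every concept this name string is attached to."""
        name_id = self.name_ids.get(text)
        if name_id is None:
            return []
        return sorted(self.concept_labels[c] for c in self.concepts_of[name_id])

    def synonyms(self, text):
        """Every other name string that shares a concept with this one."""
        name_id = self.name_ids.get(text)
        if name_id is None:
            return []
        text_of = {i: t for t, i in self.name_ids.items()}
        found = set()
        for concept_id in self.concepts_of[name_id]:
            found |= self.names_of[concept_id]
        found.discard(name_id)
        return sorted(text_of[i] for i in found)


store = NameStore()
# Synonymy: two names recorded against one concept (partial list only).
store.add_concept("bluefish", ["Pomatomus saltatrix", "Temnodon saltator"])
# Homonymy: "Ficus" is a plant genus and also a gastropod genus.
store.add_concept("fig trees (plant genus)", ["Ficus"])
store.add_concept("fig shells (gastropod genus)", ["Ficus"])

print(store.synonyms("Pomatomus saltatrix"))
print(store.concepts_for("Ficus"))
```

Nothing here decides which name is "valid"; that judgment would be one more qualifier recorded against the link between a name and a concept, which is why the name itself never has to serve as a key.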

If UbIO ends up as "another acronym added to the pile", it will be because it will have strayed from our current ideology.  If it doesn't stray, it won't fail, because we are getting, and will continue to get, results from this system.  We are designing it to work at a large scale on the chance it can deliver value to others.  We are designing it to provide increasing value through increasing usage.

The demand is there, and the 'trick' of using the bluefish synonymy to locate the articles is less a trick than a reflection of the reality of data objects in time.  I can't comment on the ITIS experience, but I know that any application I can think of that I have built, as well as most that I've seen from among the publishers, libraries, and museums I've shown this to, would use such a system.  Again, name coverage isn't the main issue.  Tools and a simple, language-independent API that don't require a user to learn yet another arbitrary coding system or to become dependent on some potentially volatile authority can save folks a lot of time and duplication of effort.
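For anyone who wants to see the bluefish example spelled out, here is a rough sketch in Python.  The function names are hypothetical, the synonym list is partial, and the article records are invented placeholders; none of it is the actual TNS code.  It only shows why a search on the current name alone misses older literature indexed under earlier names.

```python
# Partial, illustrative synonymy: each set holds names treated as one taxon.
SYNONYMY = [
    {"Pomatomus saltatrix", "Temnodon saltator", "Gasterosteus saltatrix"},
]

# Invented placeholder records standing in for a library catalog.
ARTICLES = [
    {"title": "Placeholder article 1", "text": "Notes on Temnodon saltator along the coast"},
    {"title": "Placeholder article 2", "text": "Feeding habits of Pomatomus saltatrix"},
    {"title": "Placeholder article 3", "text": "Observations on Morone saxatilis"},
]


def expand_names(name, synonymy):
    """Return the name plus every recorded synonym (just the name if unknown)."""
    for group in synonymy:
        if name in group:
            return set(group)
    return {name}


def find_articles(name, synonymy, articles):
    """Titles of articles mentioning the name or any of its synonyms."""
    wanted = {n.lower() for n in expand_names(name, synonymy)}
    return [
        article["title"]
        for article in articles
        if any(n in article["text"].lower() for n in wanted)
    ]


plain = find_articles("Pomatomus saltatrix", [], ARTICLES)
expanded = find_articles("Pomatomus saltatrix", SYNONYMY, ARTICLES)
print(plain)     # only the record that uses the current name
print(expanded)  # also picks up the record filed under the older name
```

The caller never needs to know which name is current or learn a code for the taxon; it passes whatever name it has and lets the name service do the expansion.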

I can go on as you can imagine but I'll stop here.  I can describe more of our plan only if there is interest.  We really are trying to build upon what we see as the realities of the discipline and any ignorance we have we want corrected.

Thanks for your time.

--
-------------------------------------------------------------------------------
David Remsen
Information Systems Program Developer: UbIO
Marine Biological Laboratory
Woods Hole, MA 02543
508 289 7632
-------------------------------------------------------------------------------
Technical Chair/Web Administrator /
National Association of Marine Laboratories/www.naml.org
Shark Research Institute/www.sharks.org