Data Wells Workshop @ ELAG2010

Workshop process & notes:

http://sites.google.com/site/elag2010datawell/

Workshop results

Exploratory

Four themes:

  • Choosing and Evaluating Software
  • Accessing and Ingesting
  • Scaling and Performance
  • Searching and Search Enhancement

Choosing and Evaluating Software

Prototypes are very illustrative

...but will most certainly not scale

The data you handle will end up defining the system

Accessing and Ingesting

Be aware of formats and representations

Licensing schemes are fundamental - a source of trust

Licenses on software, data, metadata, results of processing within the data well &c.

Preparing the representation format for data exposure

Scaling and Performance

The general case of software scaling

...but will the internal representation format handle heterogeneous data?

...in very large amounts?

Searching and Search Enhancement

Standardizing representation formats can help, but be aware that the format must serve:

  • Minimal requirements for searches or identification
  • Data exposure (linked data or enhancing searches)
  • Different types of processing
  • Segmenting of searches
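
As a rough illustration of the requirements above, the following is a minimal sketch of what a shared representation might look like. It is not something the workshop produced: the WellRecord class, its field names, the two converter functions, and the flattened dictionary inputs are all hypothetical, chosen only to show heterogeneous records mapped into one format that keeps a minimal set of fields for searching and identification, records the source so searches can be segmented, and carries the remaining source data along for other processing.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class WellRecord:
    """Hypothetical common representation; all field names are assumptions."""
    identifier: str   # minimal requirement: identification
    title: str        # minimal requirement: searching
    source: str       # lets searches be segmented by origin
    extra: Dict[str, Any] = field(default_factory=dict)  # remaining source data


def from_dublin_core(raw: Dict[str, str], source: str) -> WellRecord:
    # Assumes a flat dict keyed by Dublin Core element names.
    known = {"dc:identifier", "dc:title"}
    return WellRecord(
        identifier=raw["dc:identifier"],
        title=raw["dc:title"],
        source=source,
        extra={k: v for k, v in raw.items() if k not in known},
    )


def from_marc_like(raw: Dict[str, str], source: str) -> WellRecord:
    # Assumes a flattened MARC-style dict keyed by tag:
    # "001" is the control number, "245" the title statement.
    known = {"001", "245"}
    return WellRecord(
        identifier=raw["001"],
        title=raw["245"],
        source=source,
        extra={k: v for k, v in raw.items() if k not in known},
    )


def search(records: List[WellRecord], term: str,
           source: Optional[str] = None) -> List[WellRecord]:
    # Case-insensitive substring match on the title, optionally
    # restricted to one source (a simple form of search segmenting).
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower() and (source is None or r.source == source)
    ]


records = [
    from_dublin_core(
        {"dc:identifier": "a1", "dc:title": "Data Wells", "dc:creator": "X"},
        "repo",
    ),
    from_marc_like({"001": "b2", "245": "Deep data wells", "100": "Y"}, "catalogue"),
]
hits = search(records, "data wells", source="catalogue")
```

Keeping the unmapped fields in extra is one possible design choice: the common format stays small, while later processing steps can still reach the original data.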

So, What Is a Data Well Anyway?

A Glorified Object Repository?

Not just about searching or persistence

Very much about metadata

Business cases are good starting points for defining data wells

Thank you

stm@dbc.dk

http://sites.google.com/site/elag2010datawell/