TRENT: 2011

Monday, November 07, 2011

Data-Driven Documents

Data-Driven Documents. What's that?

D3.js is a small, free JavaScript library for manipulating documents based on data.

Suppose you have a data source that requires visualization, and a chart library is not enough for that purpose. This little library helps you develop the visualization using the DOM and data-driven transformations.
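To give a feeling for what "data-driven transformation" means, here is a minimal sketch in Python. It is not D3's API; it only mimics the idea of creating one document node per data value. The function name bars_from_data and the parameters bar_height and scale are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def bars_from_data(values, bar_height=20, scale=10):
    """Build an SVG element tree with one <rect> per data value."""
    svg = ET.Element("svg", {"xmlns": "http://www.w3.org/2000/svg"})
    for index, value in enumerate(values):
        # Each data value drives the attributes of exactly one node.
        ET.SubElement(svg, "rect", {
            "x": "0",
            "y": str(index * bar_height),
            "width": str(value * scale),
            "height": str(bar_height - 2),
        })
    return svg


svg = bars_from_data([4, 8, 15])
print(ET.tostring(svg, encoding="unicode"))
```

Changing the input list changes the document; nothing in the document is written by hand.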

A good example of what can be done is available here: http://www.visualizing.org/full-screen/16266

Monday, October 24, 2011

Reactive documents

I stumbled upon Tangle, a JavaScript library that simplifies the creation of reactive documents.

Whenever you have to explain scenarios or alternatives in a sensible manner, a reactive document is one possible way to do so. Changing a parameter and directly seeing the impact on all dependent parts of the document is a very effective way to teach certain problem and solution scenarios.

This library makes the creation very simple (from a technical point of view), so you are able to concentrate on the content scenario, which is of course the harder part.
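The principle behind a reactive document can be sketched in a few lines of Python. This is not Tangle's API; the class ReactiveDocument, its methods and the cookie example are hypothetical and only show how dependent values follow a changed parameter.

```python
class ReactiveDocument:
    """Holds input parameters plus formulas for values derived from them."""

    def __init__(self, **parameters):
        self.parameters = dict(parameters)
        self.derived = {}

    def define(self, name, formula):
        # formula receives the dict of current values and returns one value
        self.derived[name] = formula

    def set(self, name, value):
        self.parameters[name] = value

    def render(self, template):
        # Recompute every derived value, then fill the text template.
        values = dict(self.parameters)
        for name, formula in self.derived.items():
            values[name] = formula(values)
        return template.format(**values)


doc = ReactiveDocument(cookies=3, calories_per_cookie=50)
doc.define("calories", lambda v: v["cookies"] * v["calories_per_cookie"])
template = "If you eat {cookies} cookies, you consume {calories} calories."

first = doc.render(template)
doc.set("cookies", 5)
second = doc.render(template)
print(first)
print(second)
```

The text itself stays the same; only the parameter changes, and every dependent number in the sentence follows.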

XQuery - rarely used but ramp-up gets easier

A very nice overview of what XQuery is, how it relates to other XML-based standards, and why it is still very rarely used is given here: http://grtjn.blogspot.com/2011/10/xquery-novelties-revisited.html

Once you want to dive into XQuery a bit deeper and get a feeling for whether the approach is sufficient for your use case, you should try out BaseX, an open source XML database. Its main advantage for tryouts, which alternative solutions do not offer, is a lightweight but very useful UI front end for the content stored in the database.

You can add XML documents, try out queries and see how and what they match, and, last but not least, get a clear view of what information in an XML store looks like (from a conceptual point of view). All of this comes with an easy install and low ramp-up costs.

Alternatives (commercial and open source) are collected here: http://trent-intovalue.blogspot.com/2010/08/xquery-design-patterns.html

Tuesday, September 13, 2011

Valuable Information

Today I stumbled upon the following Twitter post:

Every human intervention in a business process introduces a 4% chance of error. - B. Beims

Sounds interesting and relevant in the context I'm working in. Then I tried to verify the source and basis for this statement:

Using Google to search for the statement
Using Google to search for the author
Finding a second or third source for this statement

To be honest, I wasn't able to verify what I would have to verify in order to use any bit of this information. Therefore I take this statement as a trigger for this blog post - better than nothing.

That is an example of today's most common topic:

more and more "characters" are accessible and flowing around the world, posted, re-posted, extended, ... like "Chinese whispers"
less and less of the accessible "information" (as a percentage of the total amount of available "information") is relevant or valid
shortened or context-less "information" does not lead to a human-usable information chain

That is just a fact and a reality everyone has to deal with. To improve your personal ability to turn "characters" into "information" you still have to go the hard way:

Don't use or post information which is not verified by at least

a second, independent source
or
personal verification
or
background information which gives you enough context to trace the information

If you do not have time for this kind of verification, just leave the "characters" as they are and mark them as irrelevant for you. This should make your personal information chain much cleaner and help you to separate relevant from irrelevant information.

Don't forget: It is never cheap to gather valuable information. It never was and never will be.

Sunday, September 04, 2011

Kevin Slavin: How algorithms shape our world

http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world.html

In most cases, writing code does not mean that you can ever control the usage and implications of the results...

HTML5 and XML

HTML5 will be the main syntax for the Internet in the next few years and will replace HTML 4.01, which is the most frequently used version today. The main driver for this shift was Google, and it has now been adopted by all major browser and OS vendors and organizations.

The main advantage of HTML5 is the large number of new built-in features, which reflect most of today's common requirements for web-based applications.

Why has XHTML 1.0/1.1 failed so far? Mainly, it was much too strict for the web community. The web, like the world, is an imperfect place, and therefore HTML5 fits into this world much better than the XML-based approach of XHTML can.

Does this mean XML for the web has lost and does not make sense at all? No, there is still space for the XML-based standards beside HTML5:

XHTML5
  XML serialization of HTML5 with stricter parsing rules
  MIME type: application/xhtml+xml
Polyglot XHTML (see http://dev.w3.org/html5/html-author/#polyglot-documents)

The major advantage of using XML to express content on the web is that it is much easier to integrate the resulting content into XML processing chains using regular XML transformation tool chains.

Easier reuse of content for different channels using XSLT / XQuery
Retrieve content as XHTML and extract only the dedicated parts (views) required for different use cases
Store and request the content using XQuery-based infrastructure
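As a small illustration of the second point, the following Python sketch parses a well-formed XHTML page with a regular XML parser and extracts one "view" by its id. The page content, the ids and the helper extract_view are invented for this example; a real tool chain would more likely use XSLT or XQuery.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for the XHTML namespace (the prefix "x" is arbitrary).
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical page; it only needs to be well-formed XML.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Demo</title></head>
  <body>
    <div id="nav"><a href="/">Home</a></div>
    <div id="summary"><p>Short version.</p></div>
    <div id="details"><p>Long version.</p></div>
  </body>
</html>"""


def extract_view(xhtml_text, element_id):
    """Return the <div> with the given id, or None if it does not exist."""
    root = ET.fromstring(xhtml_text)
    return root.find(f".//x:div[@id='{element_id}']", XHTML_NS)


summary = extract_view(PAGE, "summary")
print("".join(summary.itertext()).strip())
```

This only works because the page is well-formed XML; the same approach fails on typical tag-soup HTML, which is exactly the trade-off described here.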

The drawback is that today only the newest browsers support the MIME type "application/xhtml+xml". Therefore, for a while, Polyglot XHTML might be a good option to serve the mass of browsers while keeping the processing use cases doable.

A good summary of Polyglot XHTML and related XML based alternatives for HTML5 can be found here: http://www.xmlplease.com/xhtml/xhtml5polyglot/

Thursday, September 01, 2011

analyse and process DTDs

Working with DTDs is still a common task for XML (/SGML) driven use cases. Knowing this, it is quite amazing that there is no well-known DTD visualization tool available to support this task.

The good old "Near&Far Designer" disappeared many years ago and the source is probably lost somewhere within Open Text (the company bought Microstar Software Ltd in 1999). This tool is still in use by many organizations that have to deal with SGML DTDs (e.g. in the military or aircraft industry).

DTD documentation

There are a few open source scripts out there which convert a DTD into HTML pages for documentation purposes and are available free of charge:

DTDDoc
http://dtddoc.sourceforge.net/
LiveDTD
http://www.sagehill.net/livedtd/

There is one tool out there that supports graphical visualization, documentation and a few functions to report key aspects of the given DTD:

TreeVision (http://www.ovidius.com/meta/download/treevision.html) from the German company Ovidius. The tool is available free of charge and provides a very effective way to analyze XML / SGML DTDs.

Convert to XML Schema alternatives

If you have to process the content of a DTD for specific use cases, such as analyzing the model against custom rules, the easiest way is to convert the DTD to RELAX NG (XML syntax) or W3C XML Schema. Both are based on XML and can therefore be processed using regular XML-based tools.

The best tool to support this is trang (http://www.thaiopensource.com/relaxng/trang.html; the source is hosted on http://code.google.com/p/jing-trang/), initially created by James Clark. Compared to commercial alternatives the result is very predictable and for many use cases as good as possible.
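Once the model is available in RELAX NG XML syntax, a custom analysis is ordinary XML processing. The Python sketch below lists every element definition together with its attributes. The schema fragment is hand-written for this example and only stands in for converter output, which will look different in detail; the function name elements_with_attributes is made up as well.

```python
import xml.etree.ElementTree as ET

RNG_URI = "http://relaxng.org/ns/structure/1.0"
RNG_NS = {"rng": RNG_URI}

# Hand-written RELAX NG fragment, an assumption standing in for a converted DTD.
SCHEMA = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="book"/></start>
  <define name="book">
    <element name="book">
      <oneOrMore><ref name="chapter"/></oneOrMore>
    </element>
  </define>
  <define name="chapter">
    <element name="chapter">
      <attribute name="id"/>
      <text/>
    </element>
  </define>
</grammar>"""


def elements_with_attributes(schema_text):
    """Map each element name to the attribute names declared inside it."""
    root = ET.fromstring(schema_text)
    report = {}
    for element in root.iter(f"{{{RNG_URI}}}element"):
        # Searches all descendants, so attributes wrapped in <optional> are
        # found too; attributes pulled in through <ref> are not resolved.
        attributes = [a.get("name")
                      for a in element.findall(".//rng:attribute", RNG_NS)]
        report[element.get("name")] = attributes
    return report


report = elements_with_attributes(SCHEMA)
print(report)
```

A custom rule such as "every chapter must have an id attribute" then becomes a simple check against the resulting dictionary.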

DTDs will still exist for many years, simply because of the many legacy applications created around them. The amount of support is limited but still exists...

Monday, August 01, 2011

lost in email threads....

One of the most time-consuming daily tasks is identifying the email required for the current task in mind.

You know that you already received an email on a particular topic and want to reference it, you need the technical details for a certain topic, ...

Using email tags and the full-text search of today's email clients is quite a sufficient help to get those kinds of tasks done. But once you find a particular email, it is almost always part of a back-and-forth thread, and getting what you want requires the context of the found email. Getting the complete context in an email thread isn't trivial, even with the thread functions of common email tools. Most of them are limited in what they show, in particular for long-running threads:

you lose the message context around the identified message because the threading function rearranges the way your inbox is displayed
you don't have an easy-to-use view of what really happened, the timing of each mail in the thread and the corresponding sender
you do not have easy navigation without losing the context

A few weeks ago I became aware of ThreadVis, an add-on for the Thunderbird email client.

Pretty cool: it provides a visual graph of the email thread based on the currently selected email, with different colors for different senders, length indication for time durations and direct popup help for the content of each item.

You see where you are, what came before and after, and who the sender was. Even messages in the thread that you did not receive are visible. There is easy navigation between the emails, and a popup preview for each of the thread items.
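The structure behind such a graph can be reconstructed from the standard Message-ID and In-Reply-To headers. The following Python sketch shows the idea on four invented messages; it is not how ThreadVis is implemented, and the names build_thread and render as well as the sample addresses are assumptions for this illustration.

```python
from collections import defaultdict
from email import message_from_string

# Invented sample messages; only the headers matter for threading.
RAW_MESSAGES = [
    "Message-ID: <1@example.org>\nFrom: alice@example.org\n\nDraft attached.",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
    "From: bob@example.org\n\nLooks good.",
    "Message-ID: <3@example.org>\nIn-Reply-To: <1@example.org>\n"
    "From: carol@example.org\n\nOne question.",
    "Message-ID: <4@example.org>\nIn-Reply-To: <3@example.org>\n"
    "From: alice@example.org\n\nAnswer inline.",
]


def build_thread(raw_messages):
    """Return (senders by message id, child ids by parent id)."""
    parsed = [message_from_string(raw) for raw in raw_messages]
    senders = {m["Message-ID"]: m["From"] for m in parsed}
    children = defaultdict(list)
    for m in parsed:
        parent = m["In-Reply-To"]
        if parent not in senders:
            parent = None  # thread start, or reply to a mail we never received
        children[parent].append(m["Message-ID"])
    return senders, children


def render(senders, children, parent=None, depth=0):
    """Return the thread as indented text lines, one line per message."""
    lines = []
    for message_id in children.get(parent, []):
        lines.append("  " * depth + f"{message_id} from {senders[message_id]}")
        lines.extend(render(senders, children, message_id, depth + 1))
    return lines


lines = render(*build_thread(RAW_MESSAGES))
print("\n".join(lines))
```

The indentation plays the role of the graph: each level is one reply step, and the sender is shown next to every node.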

Voilà, what else could you want? Of course there are things that can be improved, but the basic idea and implementation are worth taking a look at...