Jena .NET framework to be re-released
After a dormant period, the Jena .NET framework will be re-released today and made available for download.
Jena .NET 0.3 has now been released.
It includes not only the latest Jena 2.6.3 library released by the Jena working group, but also now TDB 0.8.7, which means you can get a persisted session, with SPARQL querying, up and running. More news on this soon.
I’ve just uploaded the latest screenshot of Linked Data Tools’ RDF Buddy software, including a model view and a SPARQL query view. Further news on a first release will be available this month.
All tutorials in the Linked Data Tools Semantic Web Primer are now complete, and will be revised and improved over the coming months. A number of very positive comments from visitors have already been received, which is promising to see.
Further advanced topics are planned, such as an article introducing the Turtle RDF serialization (in addition to the RDF/XML serialization covered in the primer), and probably a further article on more advanced SPARQL queries.
Over the next couple of years, I can see the idea of storing data in an SQL database and then retrieving query results into datasets becoming increasingly redundant, and the idea of persistence-medium-agnostic data models, which interact with an application dynamically, becoming increasingly important.
What do I mean by this? If you develop object oriented applications in either business or government, how often do you, regardless of storage medium, eventually end up with your data being stored in in-memory data objects with properties that you can read and write on a UI?
Well – why not forget about the whole idea of pulling data out of a relational DB via a dataset, populating data objects and a UI and then, when the user has finished with the data, pushing the data back into a dataset before storing it in the database again?
Why not just get the data into objects as and when required, and query those objects?
We’re already getting there. Technologies such as LINQ are already trying to find ways of making queries to data sources agnostic. So we have LINQ-to-XML, LINQ-to-SQL, LINQ-over-dataset.
In other words, just trying to get the persistence medium – the thing providing the data – abstracted so that we don’t care anymore. We just query in the same way.
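As a rough illustration of what "querying in the same way" looks like, here is a minimal sketch that runs the same shape of LINQ query over in-memory objects and over XML. The Person class, the QueryDemo class and the person/name/age XML layout are all hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical data object, used only for this illustration.
public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class QueryDemo
{
    // LINQ to Objects: names of everyone aged 18 or over, sorted by name.
    public static List<string> AdultsFromObjects(IEnumerable<Person> people)
    {
        return people.Where(p => p.Age >= 18)
                     .Select(p => p.Name)
                     .OrderBy(n => n, StringComparer.Ordinal)
                     .ToList();
    }

    // LINQ to XML: the same query shape, but the source is an XML tree
    // of the assumed form <people><person name="..." age="..."/></people>.
    public static List<string> AdultsFromXml(XElement root)
    {
        return root.Elements("person")
                   .Where(e => (int)e.Attribute("age") >= 18)
                   .Select(e => (string)e.Attribute("name"))
                   .OrderBy(n => n, StringComparer.Ordinal)
                   .ToList();
    }
}
```

The Where/Select/OrderBy chain is identical in both methods; only the way each item exposes its name and age differs, which is the part of the persistence medium that still leaks through.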
Sound ideal? I’m really not so sure. Because eventually, I think in-memory data objects, populated on demand, are going to become the future with agnosticism as to where those data objects are located. And in addition – that the very names or details of those objects are going to become irrelevant – rather, only the meaning behind them. The semantics.
This morning I built a simple app to convert between the different RDF serialization formats (starting from any valid RDF format) using ARP via Jena .NET. You can select any RDF file, it parses the file, then displays it in the selected RDF format – RDF/XML, Turtle or N3. I’m also going to extend it to show a breakdown of the Jena model created from deserializing the RDF data (i.e. information on the statements, properties and classes/subclasses used to annotate the data) in a simple set of flat lists.
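The core of such a conversion can be sketched in a few lines. This is not the code of the app itself, only a minimal sketch: it assumes the project references the IKVM-compiled Jena assembly and the IKVM runtime assemblies, and the RdfConverter class and its parameter names are hypothetical. The language names ("RDF/XML", "TURTLE", "N3") are the strings Jena 2.x uses for its readers and writers.

```csharp
using com.hp.hpl.jena.rdf.model;

// Hypothetical helper class; assumes the IKVM-compiled Jena assembly
// and the IKVM runtime assemblies are referenced by the project.
public static class RdfConverter
{
    // Reads an RDF file in inputLang and writes it back out in outputLang.
    // Language names are Jena's own, e.g. "RDF/XML", "TURTLE" or "N3".
    public static void Convert(string inputPath, string inputLang,
                               string outputPath, string outputLang)
    {
        Model model = ModelFactory.createDefaultModel();

        // Parse the source file into the in-memory model.
        // A null base URI means relative URIs in the file are not resolved.
        java.io.InputStream input = new java.io.FileInputStream(inputPath);
        try
        {
            model.read(input, null, inputLang);
        }
        finally
        {
            input.close();
        }

        // Serialize the same model in the requested output format.
        java.io.OutputStream output = new java.io.FileOutputStream(outputPath);
        try
        {
            model.write(output, outputLang);
        }
        finally
        {
            output.close();
        }
    }
}
```

A call such as RdfConverter.Convert("data.rdf", "RDF/XML", "data.ttl", "TURTLE") would then convert an RDF/XML file to Turtle; the file names here are placeholders.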
I’m also going to wrap the Jena .NET IKVM classes in ‘helper’ classes that expose the same APIs as the wrapped classes but convert between the java.io types and their .NET equivalents. That way you no longer have to access any of the IKVM java.io classes directly to use Jena .NET, so you can forget about IKVM completely!
I’ll release it as a branded app free under the BSD licence. I’ll upload some screenshots soon and update this post.
UPDATE: A brief outline of RDF Buddy, the title under which this software will be released, can be found here: RDF Buddy
I’ve today released the first version of the Jena .NET package at http://www.linkeddatatools.com/downloads.
The package includes the ported Jena library and IKVM, all in a Visual Studio solution that demonstrates the use of the library via ported editions of Andy Seaborne’s Jena tutorial classes.
I hope to release a neater installation/deployment soon with a toolkit and status monitor, and plan to release a Jena .NET server so that data can be distributed/connected to/queried rather like a SQL (but now SPARQL) database.
I’ve successfully resolved the serious performance issues described in my previous tests using IKVM and the Jena test libraries.
As Jeroen Frijters described in comments on my previous post, he used the same code as published without such performance issues. I’ve now compiled with Visual Studio 2010 Express, in a console application, using the latest IKVM libraries to which fixes were applied. The results are overwhelmingly better and comparable to direct execution of the class loader via java.exe. Test results (in seconds) are below:
80.078 seconds – direct execution using java.exe – 9294 tests OK
94.203 seconds – execution via .NET console via IKVM libraries – 9294 tests OK
This is good enough for publishing the compiled Jena library for .NET under the BSD license, including a library of examples to get C# .NET developers interested in using the Jena library. An update will follow very soon.
I’ve run through some tests on the IKVM-compiled Jena library I created yesterday using a .NET UI application running the standard test package from a high priority thread. Code is as follows:
this.OnStatusOutput("Instantiate test package…");
junit.textui.TestRunner testRunner = new junit.textui.TestRunner();
DateTime startDateTime = DateTime.Now;
this.OnStatusOutput("Start the test…");
testRunner.start(new string[] { "com.hp.hpl.jena.test.TestPackage" });
this.OnStatusOutput("Completed. Took " + (DateTime.Now - startDateTime).ToString());
To get a feel for how long the test should take (give or take the differing environments), I executed the test.bat batch file included with the Jena library. Results were 80.257 seconds for the 9294 unit tests within the com.hp.hpl.jena.test.TestPackage test package.
After referencing the IKVM assemblies and running the same test package using my IKVM-compiled .NET assembly, the results were very disappointing.
Although no tests reported a failure (and this is an improvement on any previous attempts that I have seen documented online), the same 9294 unit tests took 2,548.625 seconds to complete – over 30 times longer.
I don’t know if this is down to an IKVM performance hit, but the difference seems abnormally large and may be due to some other, as yet unknown, factors.
I will update with any progress on this as I try and work out what is causing these performance issues.
I have today converted the Jena 2.6.2 libraries successfully to a .NET assembly. Referencing in Visual Studio 2010 appears to expose the “jena” namespace fine with no initial problems.
I will probably test over the next few days, using the supplied RDF files, to check that the conversion was fully successful, and report back. I used the following batch file:
@echo off
rem Add the IKVM tools to the PATH so that ikvmc can be found
set PATH=%PATH%;c:\ldt\ikvm-0.42.0.3\bin
C:
cd c:\ldt\jena-2.6.2\lib
echo Compile LDT.Core.Jena.dll
rem The carets continue the single ikvmc command across several lines
ikvmc -target:library xercesImpl-2.7.1.jar wstx-asl-3.2.9.jar stax-api-1.0.1.jar ^
 slf4j-log4j12-1.5.6.jar slf4j-api-1.5.6.jar log4j-1.2.13.jar junit-4.5.jar ^
 jena-2.6.2.jar jena-2.6.2-tests.jar iri-0.7.jar icu4j-3.4.4.jar arq-2.8.1.jar ^
 -out:LDT.Core.Jena.dll
To avoid any dependency issues and to go for the simplest option first, I have simply compiled all the Jena libraries into a single assembly.
One thing I would note is that the resulting assembly is no less than 15MB in size. That is rather large, given that my own .NET code will use native equivalents for some of the libraries Jena depends on (e.g. for logging) rather than the bundled ones. Having said that, it may not be a serious issue, and the assembly can be broken down into components, which is what I will do next.
If I successfully break the components down, I will maintain a binary repository at the .NET Semantic Web downloads section of LinkedDataTools, of course released under the BSD license.