Thursday, May 3, 2007

News from the .NET OR-Mapping-Front - OO-Database concerns

There is yet another player entering the game of OR-Mapping in the .NET-World: Persistor.NET. Like their competitor Genome, they come from Austria.

The product looks like an implementation of an OO-Database that uses SQL-Server as backend.

I have not investigated the product in depth, but I am sceptical about its usefulness for enterprise applications. My concerns are the following:

  • The goal of the product is to shield the developer completely from persistence-concerns:
    • The developer cannot define the DB-schema (or rather, in their philosophy, he does not have to care about the schema).
    • The mapper does not work on an existing schema.
    • The mapper generates (extends?) the schema on the fly.
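
To illustrate what "the schema is generated from the object model" can mean, here is a toy sketch in Python (not .NET). It is not how Persistor.NET works internally, which I have not investigated; the `Customer` class, the `SQL_TYPES` table and the `derive_create_table` helper are all made up for this example.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from Python field types to SQL column types.
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


def derive_create_table(cls) -> str:
    """Build a CREATE TABLE statement purely from the class definition."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__.lower()} ({columns})"


@dataclass
class Customer:
    id: int
    name: str
    credit_limit: float


print(derive_create_table(Customer))
# prints: CREATE TABLE customer (id INTEGER, name TEXT, credit_limit REAL)
```

Note that renaming a field or changing its type in the class silently changes the statement, so the developer has no separate place to decide what the schema should look like.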

The product basically implies that, in Object-Oriented development, persistence details are not a business concern and should therefore be handled entirely by the infrastructure. I disagree with this opinion on at least two levels:

  1. From an architectural standpoint the persistence details are very much a business concern for business applications:
    • There is no logical separation of the Data-Storage-Layer and the Application Layer. There is no decoupling between those two layers: As soon as I have defined my object-structure, the DB-schema is also defined.
      But this logical separation is a crucial requirement for non-trivial business applications:
      • The persistent data is almost always the real asset of a company, not the applications that handle the data. The data often lives a lot longer than the applications on top of it. So it is a critical concern how the data is stored, and the schema plays a very important role here. Data-migration is a keyword that comes to mind.
      • The persistent data often has to be accessed in different ways and even by different applications. Not all consumers of the data can or want to use the same logical (object) model to work with the data. Think about reporting, advanced queries, exports and imports…
  2. From a development standpoint, alarm bells ring for me when all persistence details are hidden in a black-box:
    • What if the application has to scale? Does the black-box scale as desired? What if not?
    • What if there are performance problems?
    • What if the black-box does not behave as I expect?
    • What if I want to use the black-box for an exotic use-case? (Example: What happens if I have an existing application with existing data and then I introduce some new subclasses for an existing class? There are several ways to map inheritance in a relational DB, each with pros and cons, and a tool cannot make that decision for me! What happens to the existing data when the schema changes because of the new inheritance-hierarchy?)
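
To make the inheritance example concrete, here is a minimal sketch in Python with SQLite (not the .NET stack discussed here). It contrasts two common mapping strategies when a subclass is introduced after data already exists. The `customer` and `corporate_customer` tables, their columns and the sample values are hypothetical.

```python
import sqlite3


def single_table(conn):
    """Strategy A: one table for the whole hierarchy, plus a discriminator."""
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO customer (name) VALUES ('Alice')")  # existing data
    # Introducing the subclass alters the existing table.
    conn.execute("ALTER TABLE customer ADD COLUMN kind TEXT NOT NULL DEFAULT 'customer'")
    conn.execute("ALTER TABLE customer ADD COLUMN vat_number TEXT")  # NULL for old rows
    conn.execute(
        "INSERT INTO customer (name, kind, vat_number) VALUES ('Acme', 'corporate', 'ATU123')"
    )
    return conn.execute(
        "SELECT name, kind, vat_number FROM customer ORDER BY id"
    ).fetchall()


def table_per_class(conn):
    """Strategy B: the subclass gets its own table, joined on the primary key."""
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO customer (name) VALUES ('Alice')")  # existing data
    # The existing table stays untouched; reads now need a join.
    conn.execute(
        "CREATE TABLE corporate_customer ("
        "id INTEGER PRIMARY KEY REFERENCES customer(id), vat_number TEXT NOT NULL)"
    )
    cur = conn.execute("INSERT INTO customer (name) VALUES ('Acme')")
    conn.execute(
        "INSERT INTO corporate_customer (id, vat_number) VALUES (?, ?)",
        (cur.lastrowid, "ATU123"),
    )
    return conn.execute(
        "SELECT c.name, cc.vat_number FROM customer c "
        "LEFT JOIN corporate_customer cc ON cc.id = c.id ORDER BY c.id"
    ).fetchall()


print(single_table(sqlite3.connect(":memory:")))
print(table_per_class(sqlite3.connect(":memory:")))
```

Strategy A keeps queries simple but forces nullable columns and a change to a table that already holds data. Strategy B leaves the old table alone but every read of the subclass needs a join. Which trade-off is acceptable depends on the application, which is exactly why I do not want a tool to decide it silently.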

So, I think that in the big picture of an enterprise application, persistence and how it is realized is a very important business concern. A lot of effort should be put into that topic, mostly in the early phases of a project (architecture, design). It is also important to fix the persistence-details as early as possible in a project, since a strong foundation is always a good thing.

What I think OR-Mapping is all about is shielding the implementation of business logic from persistence details. The developer implementing that logic does not have to be concerned with persistence. He should not need any know-how about the persistence-backend. He should be able to work on a higher level of abstraction, with an object-model that was designed and optimized for the specific application he is developing.

Also the implementation of the business logic should be decoupled from the persistence-backend, so that the backend can be changed with minimal effort.
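
One common way to get this decoupling is a repository interface between the business logic and the backend. The following is only a sketch of that idea in Python, not the API of any of the mappers mentioned in this post; `Order`, `OrderRepository`, the two implementations and `apply_discount` are hypothetical names.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Order:
    order_id: int
    total: float


class OrderRepository(Protocol):
    """What the business logic is allowed to know about persistence."""

    def save(self, order: Order) -> None: ...
    def find(self, order_id: int) -> Optional[Order]: ...


class InMemoryOrderRepository:
    def __init__(self) -> None:
        self._orders: Dict[int, Order] = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def find(self, order_id: int) -> Optional[Order]:
        return self._orders.get(order_id)


class SqliteOrderRepository:
    """Same interface, but backed by an explicitly designed relational schema."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(order_id INTEGER PRIMARY KEY, total REAL NOT NULL)"
        )

    def save(self, order: Order) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO orders (order_id, total) VALUES (?, ?)",
            (order.order_id, order.total),
        )

    def find(self, order_id: int) -> Optional[Order]:
        row = self._conn.execute(
            "SELECT order_id, total FROM orders WHERE order_id = ?", (order_id,)
        ).fetchone()
        return Order(*row) if row else None


def apply_discount(repo: OrderRepository, order_id: int, percent: float) -> Order:
    """Business logic: it only sees the repository, never the backend."""
    order = repo.find(order_id)
    if order is None:
        raise KeyError(order_id)
    order.total = round(order.total * (1 - percent / 100), 2)
    repo.save(order)
    return order
```

`apply_discount` runs unchanged against either repository, so swapping the backend touches only the repository implementation, while the relational schema remains something that is designed deliberately.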

My concerns above apply more or less to all OO-Databases. In my opinion OO-Databases are not suited for the core data-storage of business applications. I see the following scenarios for OO-Databases:

  • Prototypes that need a data storage
  • Systems that do need an embedded data storage, but where the raw data is no business concern. Clients/Users of those systems do not mainly interact with the data in the system, but with the functionality of the system. (Control platforms, embedded systems, standalone desktop applications …)
  • Intermediate repositories of business-data for distributed business applications. For instance for disconnected smart-clients. The repositories hold a copy of a subset of the business-data and there is some kind of synchronization with the primary data-storage.

But back to Persistor.NET: I don't see how they place themselves on the market. Do they compete with full-blown OR-Mappers like NHibernate, Genome or OpenAccess? Given my concerns above, I don't think so. On the other hand, there is competition from pure OO-Databases like db4o. All the features of Persistor.NET seem to be standard OO-Database features, and the so-called ‘Zero Configuration’ can be provided even better by db4o, because it does not rely on a SQL-Server as backend.

By the way: Ted Neward is writing some tutorials about db4o (Part 1, Part 2). In the tutorials he also addresses some general topics of object databases.

2 comments:

  1. Hi Jonas.
    What about OO-DB's as work-persistence solution?
    I see a great gap between enterprise-long-term storage and operational persistence.
    OO-DB's perform very well in OLTP and rapid record creation scenarios.
    Whereas they cannot compete in reporting and searching data.

    My preferred solution would be:

    OO-DB to save active data.
    Relational DB to store historical data.

    There is still a gap. And this gap is essential, as you point out very nicely. It is an engineering process to design a schema for the valuable historical data. No machine should do this automatically.

    But it is overkill to burden developers with relational models for temporary data.
    And performance might be much better if the OO-DB acts as a cache as well.

  2. Hi Jonas,
    Thanks for sharing your insight into Persistor.NET. In your posting you wonder where Persistor.NET places itself on the market. As one of the founders of 2Top and developer of Persistor.NET I would like to answer some of your questions. Let’s start with the easy and very clear part by stating where Persistor.NET cannot be used:

    - you cannot use Persistor.NET if you want to use an already existing DB schema. The DB schema is derived from the object model automatically. That’s why Persistor.NET is not an O/R Mapper - in fact there is only one model, the object oriented one, also in the database.

    - you cannot use Persistor.NET in distributed applications. There is no artificial Object Identity in the business model so object identity gets lost crossing process- or machine-boundaries.

    We address all developers who want to see persistence as a component embedded in their applications. We agree that this might not be the typical use case when writing enterprise applications. Persistor.NET does not claim to be a silver bullet.

    What about the architectural standpoint:
    We agree that logical separation of the Database and the Application can be a goal or is, in some situations, a must. But I think that both crucial requirements you list are satisfied. That is the reason why we use standard relational datastores in contrast to OO–databases or any proprietary format. If you want you can access the data also from other applications, do reporting, import or export data – it is just a relational DB-schema – like all other schemas. And if you are familiar with the object model you don’t even have to learn something new. Of course you are right: a DB schema designed by a DB admin is likely to look different. But when does this really matter?

    What about the development standpoint:
    You are right again: using a black box and having no chance to change anything inside can be a nightmare. But don’t you also agree that we are already living with a lot of black boxes?
    Let's discuss the performance aspect. Our experience is that tuning the database is necessary, has to be done by experts and that it is really hard work. We just can’t imagine that it can be done automatically. Any abstraction layer slows things down by default but it also offers opportunities for optimization that can hardly be done by individual developers. In some sense it is similar to the time when compilers for languages like FORTRAN came up and were used instead of coding assembler ( now you can guess my age ;-) ).

    Many developers did not trust the code generated by a compiler: “Code generated by a compiler can never be as efficient as code written with my experience!” they said. They were right and they are still right – but does it matter?

    So simplicity is our first design goal! Although simplicity cannot be applied in all situations we are sure that our concept is applicable in many applications.

    The second key concern is object orientation – complete and consistent. Persistor.NET claims to be able to store and retrieve any object graph. This is unique! Persistor.NET seamlessly integrates into the .NET Framework and in its notion of persistence.
