AOD Americati Object Database

Intro

AOD is a Java library which implements an object-oriented in-memory database with a journaling backend. It is aimed at applications requiring only a "small" dataset (relative to memory size).

The structure is defined by the object tree, starting from an "initial map". If needed, a relational model may be built upon it.

The primary requirement is an application which needs to persist objects of the following "persistable" types: Maps, Lists, arrays and JavaBean POJOs (with some restrictions, see details below.)

The second requirement is that all the data must fit in the process memory. The "queries" are simple Java object (or container) operations. Persistence is activated only on mutation.

A major difference with traditional database programming concerns multithreaded code: a database is usually considered (implicitly) "thread safe", so many threads work with it without having to consider synchronization among themselves; in AOD the database objects are also thread safe, with some specific exceptions.

Advantages

Simplify infrastructure and personnel requirements
Simplify the data structure updates as the software evolves (no DML/DDL separation)
Avoid the complexity of O/R mapping: AOD may be considered as doing an O/J mapping from objects to journaled disk files
Object oriented syntax, providing a hopefully simpler programming model as compared with dealing with pure relational databases
In-memory, which is fast, leveraging the lower costs of the media
Native Java, simplifying the architecture and dependencies

Disadvantages

There is no support for native database transactions: the rollback steps must be manually performed as needed
The in-memory feature is fast, but prohibitive for very large data sets

Using AOD

Database setup

AOD needs an empty directory and a "database name". AOD will not create the database directory.

This information must be provided in an AODCfg object:

File dir = getConfiguredDatabaseDirectory();
String databaseName = "mytest";
AODCfg cfg = AODCfg.create(dir, databaseName);

The name is used as a prefix for the datafiles to be created and managed by AOD.

It is possible to have several databases inside the same directory, provided they have distinct names.

Operation

The AOD.startDatabase(AODCfg cfg) static method starts the database engine; it loads all the database information into memory and provides an AODEngine instance, ready to work. The next step is getting a reference to the "initial map" using the getInitialMap() method. This is a Map implementation which is "persisted", which means that any (valid) inserted object will be saved to disk for later recovery.

Note	When the database engine is started for the first time, the minimal needed "datafiles" are automatically created in order to be ready for operation.

The database should be closed with the shutdown() method when no longer needed.

Note

When the AODEngine is started, a "check file" named after the database, with extension .wrk, is created. This is used to avoid the simultaneous startup of another AODEngine pointing to the same database (which would corrupt the database.) This file is automatically removed at shutdown time. In the event of an abnormal shutdown of the database, this file will not be removed and will not allow the database to start. In this event, care must be taken to verify that no instance of the database is running; then the file must be manually removed before restarting the database.
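The check-file mechanism can be pictured with plain java.nio. What follows is only a sketch of the general lock-file technique, not AOD's code: the class WrkFileGuard and its methods are hypothetical, and only the .wrk extension and the create-at-start, remove-at-shutdown behavior come from the note above.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a "check file" guard; not part of the AOD API.
final class WrkFileGuard implements AutoCloseable {
    private final Path checkFile;

    private WrkFileGuard(Path checkFile) {
        this.checkFile = checkFile;
    }

    // Creates <dir>/<databaseName>.wrk, failing if it is already there.
    static WrkFileGuard acquire(Path dir, String databaseName) throws IOException {
        Path f = dir.resolve(databaseName + ".wrk");
        try {
            // Existence check and creation happen as one atomic operation.
            Files.createFile(f);
        } catch (FileAlreadyExistsException e) {
            throw new IllegalStateException(
                "check file present, another engine may be running: " + f, e);
        }
        return new WrkFileGuard(f);
    }

    // Normal shutdown: remove the check file so the next startup succeeds.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(checkFile);
    }
}
```

After a crash close() never runs, so the file stays on disk and the next acquire() fails until the file is removed by hand, which mirrors the recovery procedure described above.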

Result<AODEngine,AODERR> rEngine = AOD.startDatabase(cfg);
assertTrue(rEngine.hasData());
AODEngine engine = rEngine.getData();
Map<String,Object> initialMap = engine.getInitialMap();

The main rules:

Read-only operations on persisted objects may be executed in parallel by any number of threads, but only a single thread at a time is able to write.
Adding an object to a persisted container (like the "initial map") creates a "persisted" version of that object, which may later be extracted from the container.
Any change to these "persisted objects" will also be persisted.
All the persisted objects must be chained (maybe indirectly) to the initial map; otherwise, they will not be available after the next database startup.

Persisting objects in the "initial map"

As said, any "persistable object" may be put in the initial map:

AODEngine engine = rEngine.getData();
Map<String,Object> initialMap = engine.getInitialMap();
initialMap.put("x", "A String is persistable");

The AODEngine.get() and AODEngine.put() methods are restricted to String keys, and abbreviate the "initial map" operations:

AODEngine engine = rEngine.getData();
engine.put("x", "A String is persistable");

Unlike the keys of the initial map, these keys must be Strings, but they also provide the concept of "path": a text of the form c1/c2/… where c1, c2, etc. are path components which correspond to intermediate map keys, starting from c1 which represents a key in the initial map. For example:

AODEngine engine = rEngine.getData();
engine.put("x/y/z", "Hello!");

This will add the x key to the initial map, associated with a new HashMap. Inside it, the key y will be associated with another HashMap; finally, inside the latter, the key z will be associated with the Hello! String. The intermediate maps will be created as needed, but if a non-map object is found then an exception is thrown.
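The path semantics can be reproduced with ordinary java.util maps. The following is a minimal sketch of the behavior just described, not AOD's implementation; the class PathPutSketch is hypothetical, and the use of IllegalStateException is an assumption, since the text does not name the exception type.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the path semantics; not AOD's implementation.
final class PathPutSketch {

    // Walks c1/c2/.../cn from root, creating intermediate HashMaps as needed.
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> root, String path, Object value) {
        String[] parts = path.split("/");
        Map<String, Object> current = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object next = current.get(parts[i]);
            if (next == null) {
                next = new HashMap<String, Object>();
                current.put(parts[i], next);
            } else if (!(next instanceof Map)) {
                // Assumed exception type; the document only says "an exception".
                throw new IllegalStateException("'" + parts[i] + "' is not a map");
            }
            current = (Map<String, Object>) next;
        }
        // The last component is the key of the value itself.
        current.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> root = new HashMap<>();
        put(root, "x/y/z", "Hello!");
        System.out.println(root); // {x={y={z=Hello!}}}
    }
}
```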

Writing

In order to reserve the "write permission" for a sequence of read/write operations, the following pattern may be employed:

try(AODWriter w = engine.getWriter()) {
	// do read/write operations
}

This block guarantees that the database is not modified by other threads. Note that this is not mandatory: any write operation implies the capture of this permission, but a sequence of unguarded writes may "see" distinct states of the database because of concurrent writes.
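The difference between a single write and a guarded sequence can be shown with a generic closeable permission built on a reentrant lock. This is a sketch of the idiom only: AOD's internals are not described here, and WritePermissionSketch, Permit and the counter field are all hypothetical stand-ins.

```java
import java.util.concurrent.locks.ReentrantLock;

// Generic illustration of a closeable write permission; not AOD's code.
final class WritePermissionSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile long counter; // stands in for persisted state

    // Hypothetical permission handle: closing it releases the lock.
    interface Permit extends AutoCloseable {
        @Override
        void close(); // narrowed: no checked exception
    }

    Permit getWriter() {
        lock.lock(); // reentrant, so nested acquisition by the owner succeeds
        return lock::unlock;
    }

    // A single write captures the permission by itself.
    void set(long v) {
        try (Permit p = getWriter()) {
            counter = v;
        }
    }

    // Reads do not take the permission.
    long get() {
        return counter;
    }

    // A read-modify-write sequence guarded as a whole: no other writer
    // can slip in between the get() and the set().
    void increment() {
        try (Permit p = getWriter()) {
            set(get() + 1);
        }
    }
}
```

Without the outer guard in increment(), two threads could both read the same value and one update would be lost, which is the "distinct states" hazard mentioned above.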

Multiple databases

It is totally okay to add an already persisted object (in an AOD database) into another target AOD database; in this case a new object will be created and persisted in the target database. This pattern may be used -for example- to enable the storage of historical information: it would be too wasteful to load all the available data into a single AOD instance (since everything resides in memory), so a compromise may be achieved partitioning the information into distinct databases (for example, one database for a chunk of annual records.)

Note that another alternative for historical information is provided by "Object Logs" (see below), especially when reads are infrequent and speed is not critical.

Persistable objects

Primitive numeric types and their boxed versions (including char/Character and boolean/Boolean)
Objects of classes: String, BigDecimal and BigInteger
Other JavaBean POJOs whose attributes are in this list
If enabled, arrays whose contents are in this list
Immutable objects registered by the running AODSerializer as serializable

Note

If array persistence is enabled (see the enable-arrays setting), when an individual element of a persisted array is modified (that is, applying X[Z]=V), the change will NOT be persisted. The array must be totally replaced with a new version, which will be persisted in its place. As a (rather slower) alternative, consider persisting a List and using its set() method.

Persisted Maps

These are objects which satisfy the Map interface or the NavigableMap interface. For example, a HashMap will have a persisted version which satisfies Map, while a TreeMap will have a persisted version which satisfies NavigableMap.

These maps allow neither null keys nor null values. A put() with a null value will silently remove the corresponding entry; a null key will throw an exception. For NavigableMap the keys must be immutable Java Comparable objects and no Comparator may be in use.
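The null rules can be mimicked over any ordinary map. This is a minimal sketch of the described behavior, not AOD's implementation; the helper NullRulesSketch.put is hypothetical, and both the NullPointerException and the returned previous value are assumptions, since the text does not specify them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Plain-Java illustration of the null rules; not AOD's implementation.
final class NullRulesSketch {

    // put() that rejects null keys and treats a null value as a removal.
    static <K, V> V put(Map<K, V> map, K key, V value) {
        // Assumed exception type; the document only says "an exception".
        Objects.requireNonNull(key, "null keys are not allowed");
        if (value == null) {
            return map.remove(key); // silently drops the entry, if any
        }
        return map.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, Object> m = new HashMap<>();
        put(m, "a", 1);
        put(m, "a", null);
        System.out.println(m.containsKey("a")); // false
    }
}
```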

As mentioned, when a valid object is added to a persisted map (or any other persisted container), a new persisted version of it is created and effectively added to the container. The persisted version may be extracted using the usual container methods.

It is okay to add an already persisted object to a container; in this case no new persisted object will be created:

Map<String,Object> initialMap = engine.getInitialMap();
MyBean x = new MyBean();
// add the bean to the persisted map:
initialMap.put("some key", x);
// get persisted version
MyBean p_x = (MyBean) initialMap.get("some key");
// the persisted version is a new object which
// uses a subclass of MyBean
assertFalse(x == p_x);
// put again the persisted bean:
initialMap.put("other key", p_x);
// get persisted version using the new key:
MyBean p_xx = (MyBean) initialMap.get("other key");
// no new object was created: p_xx is the same as p_x
assertTrue(p_x == p_xx);
// at the end of the day:
engine.shutdown();

Specialized Map implementations may lose their specific features in their persisted version. For example, a LinkedHashMap object will not preserve the insertion order of its entries.

Note

It is good practice to employ some naming convention to easily distinguish between plain in-memory objects and persisted objects. The previous sample used a p_ prefix for this reason. To find out at runtime whether an object is persisted in a given database, test with AODEngine.isPersisted(Object) (another test is instanceof AODPersisted, but this only tells whether the object is associated with some database.)

Note	The maps are persisted using the classes `AODNavMap` and `AODMap` for maps which implement `NavigableMap` and `Map` respectively. The initial map is currently an `AODMap` so it is not navigable.

Persisted Sets

Objects which implement NavigableSet or Set will be persisted (as AODNavSet and AODSet respectively.) This implementation does not allow the insertion of a null element.

Persisted Lists

Any persisted Java List object will get a persisted version implemented by an AODList object. Internally, it is built in terms of a Java NavigableMap.

Performance considerations

Random access to its elements is O(log n) and not O(1) as in ArrayList. Also, there are two slow operations which imply a full copy of the list: the removal of elements (at any position) and the addition of an element in a non-final position. This is a compromise made to allow for reasonably fast operation and predictable iterators. At some point we considered CopyOnWriteArrayList but (in our opinion) its benefits apply only in too restricted an application context.
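These costs follow naturally from storing list cells in a sorted map keyed by position. The sketch below illustrates that general idea with a TreeMap; it is not the AODList implementation, and the class MapBackedListSketch is hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration of a list stored in a sorted map; not the AODList code.
final class MapBackedListSketch<E> {
    private NavigableMap<Integer, E> cells = new TreeMap<>();

    int size() {
        return cells.size();
    }

    // Random access is a map lookup: O(log n).
    E get(int index) {
        checkIndex(index);
        return cells.get(index);
    }

    // Appending at the end only adds one key: O(log n).
    void add(E element) {
        cells.put(cells.size(), element);
    }

    // Removal renumbers every later cell, so the map is rebuilt: O(n log n).
    E remove(int index) {
        checkIndex(index);
        NavigableMap<Integer, E> copy = new TreeMap<>();
        E removed = null;
        for (Map.Entry<Integer, E> entry : cells.entrySet()) {
            int i = entry.getKey();
            if (i == index) {
                removed = entry.getValue();
            } else {
                copy.put(i < index ? i : i - 1, entry.getValue());
            }
        }
        cells = copy; // iterators over the old map keep a stable view
        return removed;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= cells.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```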

Persisted POJOs

The POJOs must implement the JavaBean conventions. Their acceptable attributes are the valid persistable objects.

Example:

Persist a POJO, get its persisted version and modify it:

MyBean x = new MyBean();
// here go setters for x ...
x.setSomeProp(1);
// add the bean to the persisted map:
initialMap.put("some key", x);
// get persisted version
MyBean y = (MyBean) initialMap.get("some key");
// reset a property, the new value will also be persisted:
y.setSomeProp(2);

Note	POJO fields annotated with `@AODIgnore` are ignored for persistence (they remain with their default values.) This is handy when such field is not needed to be persisted at all: it reduces disk consumption and allows for attributes with non-persistable types.

Multithreaded code

The objectives:

Avoid damaging the data (making inconsistent states)
Avoid deadlocks
Avoid extracting invalid data
Support good performance

As far as we know, these objectives are impossible to satisfy simultaneously in a multithreaded application, so compromises are in place (like those imposed by the "isolation levels" in RDBMSs.)

We consider that the most dangerous data inconsistency lies in creating an in-memory version of the data which, after reloading, results in different contents. This may happen if the modification of the data structures and the disk writes are not synchronized between threads. That is the reason we employ the AODWriter permission which runs in a synchronized thread.

Another source of inconsistency may happen even in a single thread if a disk write is sent to the journal but the in-memory version does not allow such write, and throws some exception: it is expected that a future database load process will throw the same exception.

To avoid extracting invalid data we rely on weakly consistent containers as provided by Java’s ConcurrentHashMap, which is employed for maps, sets and lists.

Note that the synchronization of the writer methods does not imply any kind of transactional isolation nor transactional behavior.

Unpersisted versions

The AODEngine.cloneUnpersist(Object o) method creates an "unpersisted" clone replicating the provided persisted object. If the provided object is not persisted, then this method simply creates a "clone".

The type of the provided object must satisfy the same rules as for persistable objects. Also, for objects implementing the Set, Map or List interfaces, the created version will be of the original container class provided at persistence time.

Custom user types

There are non-persistable types which may be needed by the application. For example, a Java timestamp object field is not a JavaBean and can’t be persisted by AOD. A direct solution is to redesign the structure to store "simpler" fields which represent the time (for example, an integer for the day of month, another for the month, etc.)
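As a sketch of this "simpler fields" redesign, the bean below stores a timestamp as three numeric properties and converts on demand. The class EventBean and its property names are hypothetical; the conversion methods are deliberately not named get*/set* so that they do not look like extra JavaBean properties (whether AOD would inspect them is not stated in this document).

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical bean: a timestamp kept as plain numeric properties.
class EventBean {
    private long epochSecond;  // seconds since 1970-01-01T00:00:00Z
    private int nano;          // nanosecond-of-second
    private int offsetSeconds; // UTC offset, in seconds

    public long getEpochSecond() { return epochSecond; }
    public void setEpochSecond(long v) { epochSecond = v; }
    public int getNano() { return nano; }
    public void setNano(int v) { nano = v; }
    public int getOffsetSeconds() { return offsetSeconds; }
    public void setOffsetSeconds(int v) { offsetSeconds = v; }

    // Goes through the setters so that each change is an ordinary
    // property update.
    void assignFrom(OffsetDateTime t) {
        setEpochSecond(t.toEpochSecond());
        setNano(t.getNano());
        setOffsetSeconds(t.getOffset().getTotalSeconds());
    }

    // Rebuilds the original timestamp, offset included.
    OffsetDateTime toOffsetDateTime() {
        Instant instant = Instant.ofEpochSecond(getEpochSecond(), getNano());
        return OffsetDateTime.ofInstant(
            instant, ZoneOffset.ofTotalSeconds(getOffsetSeconds()));
    }
}
```

The round trip is lossless for OffsetDateTime, at the price of three properties per timestamp and some conversion code in the application.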

A more elaborate way is to provide AOD with a custom serializer and notify it of the presence of the new persistable type.

AOD employs an object implementing AODSerializer to serialize the "immutable" objects (like numeric constants, strings, etc.) There are two implementations provided by AOD: the TextSerializer and the BinSerializer. The first is oriented to a textual representation of the objects; the latter leverages Java’s ObjectStream machinery.

Note	The `TextSerializer` in turn relies on the experimental `Serializer` class provided by the AJU project.

Example: timestamps

In order to support the persistence of OffsetDateTime objects, we provide a new AODSerializer by extending the com.americati.aod.impl.BinSerializer:

package org.acme;

import java.time.OffsetDateTime;
import java.util.HashSet;
import java.util.Set;

import com.americati.aod.impl.BinSerializer;

public class TimeBinSerializer extends BinSerializer {

	@Override
	public Set<Class<?>> getImmutableTypes() {
		Set<Class<?>> ans = new HashSet<Class<?>>(super.getImmutableTypes());
		ans.add(OffsetDateTime.class);
		return ans;
	}
}

In the getImmutableTypes() method we add our class OffsetDateTime.class to the standard set of immutable types (basically the primitive autoboxed classes, including String, BigDecimal and BigInteger.)

Remember to provide this serializer to the AOD configuration at database initialization time:

aodCfg.setSerializer("org.acme.TimeBinSerializer");

Object Logs

These are ever-growing sequences of objects which are stored on disk, but not in memory. The objects must be of a persistable type as previously described, but no persisted version will be available in memory.

To extract the saved objects an Eater must be provided to the read() method in order to process the loaded (unpersisted) objects one at a time, by doing a total disk read (which may be slow depending on the size.) Note that unlike the databases, the extracted contents will not remain in memory after read.

Note	The extracted objects may be mutated by the `Eater` without further effect (no further persistence will occur.) New instances will be created for further calls to `read()`.

For example, for storing and printing some strings:

Result<AODLog<String>,AODERR> r2 = AOD.startLog(cfg);
assertTrue(r2.hasData());
AODLog<String> e2 = r2.getData();
e2.persist("Krat");
e2.persist("Krit");
e2.persist("GOR");
e2.persist("Tang");
e2.shutdown();

// add another string
Result<AODLog<String>,AODERR> r3 = AOD.startLog(cfg);
assertTrue(r3.hasData());
AODLog<String> e3 = r3.getData();
e3.persist("ELE");

// read and print the stored strings
Optional<AODERR> r = e3.read(s->{
	System.out.println(s);
});
assertFalse(r.isPresent()); // no exceptions were thrown

Note that the Eater will be executed by the same calling thread of read(). If the Eater throws an exception, it counts as a "load error" which may interrupt the process depending on the max-load-errors configuration setting.

Relational model facilities

AOD supports the relational model for domains where it is needed. The relational tables are abstracted by the AODRelTab interface which is based on the following concepts:

The table contents (rows) are persisted JavaBeans of a single type
The table has a Java Comparable primary key which usually is a row class attribute, or a new class which combines some of the row class attributes
The primary key can be obtained from an object which implements the AODIndexer class

There is a GenIndexer class which simplifies the implementation of AODIndexer instances. The AODIndexer instances can’t be created from anonymous classes.

Note	Internally, the tables are backed by a `ConcurrentSkipListMap`.

There are also indexes abstracted by the AODRelIndex interface (which is also implemented by AODRelTab as a special case.) Again, the index key must be generated from an AODIndexer instance.

Finally, there are subsets from the table contents which are referred as views, abstracted by the base AODRelView interface, which provides several query oriented methods.

The AODRelTab tables support the usual insert, update and delete methods, and the creation of indexes.

Also, AODRelIndex objects (which include tables) allow the establishment of foreign keys, referring to a "parent" index in another table.
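To make the table concepts concrete, here is a minimal sketch of a primary-key table backed by a ConcurrentSkipListMap, with the key derived from each row. It does not use or reproduce the AODRelTab, AODIndexer or GenIndexer APIs, whose signatures are not shown in this document: TableSketch, Customer and all the method names are hypothetical, and the key extractor is an ordinary java.util.function.Function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Function;

// Illustration of a keyed table over a skip list; not the AODRelTab API.
final class TableSketch<K extends Comparable<K>, R> {
    private final ConcurrentSkipListMap<K, R> rows = new ConcurrentSkipListMap<>();
    private final Function<R, K> keyOf; // plays the role of an indexer

    TableSketch(Function<R, K> keyOf) {
        this.keyOf = keyOf;
    }

    // Fails (returns false) when the primary key already exists.
    boolean insert(R row) {
        return rows.putIfAbsent(keyOf.apply(row), row) == null;
    }

    // Replaces the row with the same primary key, if there is one.
    boolean update(R row) {
        return rows.replace(keyOf.apply(row), row) != null;
    }

    boolean delete(K key) {
        return rows.remove(key) != null;
    }

    Optional<R> find(K key) {
        return Optional.ofNullable(rows.get(key));
    }

    // Range query in primary-key order, both bounds inclusive.
    List<R> between(K from, K to) {
        return new ArrayList<>(rows.subMap(from, true, to, true).values());
    }

    // Hypothetical row type following the JavaBean conventions.
    static class Customer {
        private int id;
        private String name;

        public Customer() { }

        Customer(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }
}
```

A secondary index could be sketched the same way, as another sorted map from the index key to the primary key, kept in step by insert, update and delete.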

Saving configuration settings

The AODCfg configuration object may be stored in the database directory in a file named name.aod (which has the standard Java Properties format.)

On disk, the settings are denoted as follows:

df-max-length

When the running datafile reaches this size (in bytes), create the next one. Defaults to "67108864" (64 megabytes). Use zero to disable this criterion.

df-max-old

When the running datafile reaches this age, create the next one; the value is the text representation of a Java Duration object.

df-max-cnt

When the running datafile reaches this number of uninterrupted record writes, create the next one. Defaults to "0" (disabled.)

df-check-cnt

How frequently to check the new datafile criteria, in terms of record writes. Defaults to "256"; that is, the previous checks are done every 256 writes to the journal. The new datafile is created if at least one of the previous criteria is satisfied.

comp-auto

If "true" or "1" (the default), enables automatic compression, subject to the following two settings, which must both be satisfied for the compression to actually take place; "false" or "0" disables the automatic compression (it is expected that another process fires it at some interval.)

comp-cron

Crontab-style expression for time to fire the compression (default is "0 0 0 2 * *", that is, every day at 2 AM.)

comp-df-min

Minimum number of existing datafiles to fire the compression (minimum and default = "3".)

serializer

Name of the class which implements AODSerializer. Defaults to com.americati.aod.impl.TextSerializer; also provided is com.americati.aod.impl.BinSerializer. Note that this parameter can only be set at database initialization time.

enable-arrays

Defaults to false. If true, allows the persistence of arrays. We consider this a relatively dangerous feature since the changes applied to a persisted array are not detected by AOD (because of JVM limitations.) See the Persistable objects section for more details.

max-load-errors

Defaults to zero. A non-negative value signals the maximum number of acceptable "logical" errors happening during the data load process (for example, a de-serialization problem.) A negative value disables this check. Note that data corruption at the journaling level can’t be ignored by this mechanism: the data files must be repaired for the database to start.

flush-interval

In milliseconds, the interval for flushing the writer buffer. If zero (the default), then the writer is unbuffered. This automatic facility is designed to mitigate the possibility of journal corruption due to system crash.
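Since the settings file uses the standard Java Properties format, its contents can be inspected with java.util.Properties. The sketch below parses an illustrative file body; the setting names and defaults are the ones listed above, except df-max-old, whose value PT24H is only an example (no default is documented). The class SettingsSketch is hypothetical and does not go through AODCfg.

```java
import java.io.IOException;
import java.io.StringReader;
import java.time.Duration;
import java.util.Properties;

// Reads an illustrative settings file body with plain java.util.Properties.
final class SettingsSketch {

    // Example contents of a name.aod file (values are the documented
    // defaults, except df-max-old which is an arbitrary example).
    static final String SAMPLE = String.join("\n",
        "df-max-length=67108864",
        "df-max-old=PT24H",
        "df-max-cnt=0",
        "df-check-cnt=256",
        "comp-auto=true",
        "comp-df-min=3",
        "serializer=com.americati.aod.impl.TextSerializer",
        "enable-arrays=false",
        "max-load-errors=0",
        "flush-interval=0");

    static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    public static void main(String[] args) throws IOException {
        Properties p = parse(SAMPLE);
        long maxLength = Long.parseLong(p.getProperty("df-max-length"));
        // Duration text form is ISO-8601, as produced by Duration.toString().
        Duration maxAge = Duration.parse(p.getProperty("df-max-old"));
        System.out.println(maxLength / (1024 * 1024) + " MiB, " + maxAge.toHours() + " h");
    }
}
```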

TODO

A tool for reconstructing partially damaged datafiles
A command line tool as in SQL databases