Intro
AOD is a Java library that implements an object-oriented, in-memory database with a journaling backend. It is aimed at applications requiring only a "small" dataset (relative to the available memory.)
The structure is defined by the object tree, starting from an "initial map". If needed, a relational model may be built upon it.
The primary requirement is an application which needs to persist objects of the following "persistable" types: Maps, Lists, arrays and JavaBean POJOs (with some restrictions, see details below.)
The second requirement is that all the data must fit in the process memory. The "queries" are plain Java object (or container) operations. Persistence is activated only on mutation.
A major difference from traditional database programming concerns multithreaded code: a database is usually considered (implicitly) "thread safe", so many threads deal with it without having to consider the synchronization of the executing threads; in AOD the database objects are also thread safe, with some specific exceptions.
Advantages
- Simplify infrastructure and personnel requirements
- Simplify the data structure updates as the software evolves (no DML/DDL separation)
- Avoid the complexity of O/R mapping: AOD may be considered as doing an O/J mapping from objects to journaled disk files
- Object-oriented syntax, providing a hopefully simpler programming model compared with dealing with pure relational databases
- In-memory, which is fast, leveraging the falling cost of memory
- Native Java, simplifying the architecture and dependencies
Disadvantages
- There is no support for native database transactions: the rollback steps must be performed manually as needed
- The in-memory feature is fast, but prohibitive for data sets that are too large
Using AOD
Database setup
AOD needs an empty directory and a "database name". AOD will not create the database directory, so it must already exist.
This information must be provided in an AODCfg object:
File dir = getConfiguredDatabaseDirectory();
String databaseName = "mytest";
AODCfg cfg = AODCfg.create(dir, databaseName);
The name is used as a prefix for the datafiles to be created and managed by AOD.
It is possible to have several databases inside the same directory, provided their names are distinct.
Operation
The AOD.startDatabase(AODCfg cfg) static method starts the database engine; it loads all the database information into memory and provides an AODEngine instance, ready for work. The next step is getting a reference to the "initial map" using the getInitialMap() method. This is a Map implementation which is "persisted", which means that any (valid) inserted object will be saved to disk for later recovery.
Note: When the database engine is started for the first time, the minimal set of "datafiles" is created automatically so that it is ready for operation.
The database should be closed with the shutdown() method when no longer needed.
Note: When the AODEngine is started, a "check file" named after the database, with extension .wrk, is created. It prevents the simultaneous startup of another AODEngine pointing to the same database (which would corrupt the database.) This file is removed automatically at shutdown time. After an abnormal shutdown the file is left behind and prevents the database from starting. In that case, first verify that no instance of the database is running, then remove the file manually before restarting the database.
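The check-file idea described in the note can be sketched with the standard library alone. This is only an illustration under assumptions: the CheckFile class and its methods are hypothetical and do not reproduce AOD's actual implementation; the sketch relies on Files.createFile, which fails atomically if the file already exists.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical guard file; not AOD's actual implementation. */
final class CheckFile implements AutoCloseable {
    private final Path file;

    private CheckFile(Path file) {
        this.file = file;
    }

    /** Atomically creates dir/name.wrk, or fails if it already exists. */
    static CheckFile acquire(Path dir, String name) throws IOException {
        Path file = dir.resolve(name + ".wrk");
        try {
            Files.createFile(file); // existence check and creation are one atomic step
        } catch (FileAlreadyExistsException e) {
            throw new IllegalStateException(
                "Database '" + name + "' is running or was not shut down cleanly: " + file, e);
        }
        return new CheckFile(file);
    }

    /** Removes the check file at orderly shutdown. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(file);
    }
}
```

A file left behind by a crash makes the next acquire() fail, which matches the manual-removal procedure described above.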
Result<AODEngine,AODERR> rEngine = AOD.startDatabase(cfg);
assertTrue(rEngine.hasData());
AODEngine engine = rEngine.getData();
Map<String,Object> initialMap = engine.getInitialMap();
- Read-only operations on persisted objects may be executed in parallel by any number of threads, but only a single thread at a time is able to write.
- Adding an object to a persisted container (like the "initial map") creates a "persisted" version of that object, which may later be extracted from the container.
- Any change to these "persisted objects" will also be persisted.
- All the persisted objects must be chained (maybe indirectly) to the initial map; otherwise, they will not be available after the next database startup.
As said, any "persistable object" may be put in the initial map:
AODEngine engine = rEngine.getData();
Map<String,Object> initialMap = engine.getInitialMap();
initialMap.put("x", "A String is persistable");
The AODEngine.get() and AODEngine.put() methods are restricted to String keys, and abbreviate the "initial map" operations:
AODEngine engine = rEngine.getData();
engine.put("x", "A String is persistable");
Unlike the initial map, the keys must be of type String; these methods also provide the concept of a "path": a text of the form c1/c2/… where c1, c2, etc. are path components which correspond to intermediate map keys, starting from c1, which represents a key in the initial map. For example:
AODEngine engine = rEngine.getData();
engine.put("x/y/z", "Hello!");
This will add the x key to the initial map, associated with a new HashMap. Inside it, the key y will be associated with another HashMap; finally, inside the latter, the key z will be associated with the Hello! String. The intermediate maps are created as needed, but if a non-map object is found then an exception is thrown.
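The path behaviour just described can be mimicked on plain Java maps. The PathPut helper below is hypothetical and only illustrates the semantics (intermediate maps created on demand, an exception when a non-map is found on the way); the exception type is an assumption, since the text does not say which one AOD throws.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that imitates the c1/c2/... path semantics on plain maps. */
final class PathPut {
    /** Stores value under the given path, creating intermediate maps as needed. */
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> root, String path, Object value) {
        String[] parts = path.split("/");
        Map<String, Object> current = root;
        // Walk every component except the last one, which is the final key.
        for (int i = 0; i < parts.length - 1; i++) {
            Object next = current.get(parts[i]);
            if (next == null) {
                next = new HashMap<String, Object>();
                current.put(parts[i], next);
            } else if (!(next instanceof Map)) {
                throw new IllegalStateException("'" + parts[i] + "' is not a map");
            }
            current = (Map<String, Object>) next;
        }
        current.put(parts[parts.length - 1], value);
    }
}
```

Calling PathPut.put(root, "x/y/z", "Hello!") on an empty map leaves root holding x, which holds y, which holds z; a later put to "x/y/z/w" fails because z is a String and not a map.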
In order to reserve the "write permission" for a sequence of read/write operations, the following pattern may be employed:
try(AODWriter w = engine.getWriter()) {
// do read/write operations
}
This block guarantees that the database is not modified by other threads while it runs. The block is not mandatory: every write operation captures this permission implicitly, but a sequence of unguarded writes may "see" distinct states of the database because other threads can write in between.
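The try-with-resources pattern above can be understood as a lock wrapped in an AutoCloseable. The WriteGuard class below is a hypothetical sketch built on ReentrantLock, not AOD's AODWriter; reentrancy is assumed here so that a single write issued inside a guarded block can capture the permission again without blocking.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical write permission modelled as a reentrant lock. */
final class WriteGuard implements AutoCloseable {
    private static final ReentrantLock LOCK = new ReentrantLock();

    private WriteGuard() {
    }

    /** Blocks until the calling thread owns the write permission. */
    static WriteGuard acquire() {
        LOCK.lock();
        return new WriteGuard();
    }

    static boolean heldByCurrentThread() {
        return LOCK.isHeldByCurrentThread();
    }

    /** Releases one level of ownership; called automatically by try-with-resources. */
    @Override
    public void close() {
        LOCK.unlock();
    }
}
```

Because the guard is released in close(), the permission is returned even when the guarded block throws.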
It is perfectly fine to add an already persisted object (in an AOD database) to another target AOD database; in this case a new object will be created and persisted in the target database. This pattern may be used, for example, to store historical information: it would be too wasteful to load all the available data into a single AOD instance (since everything resides in memory), so a compromise may be achieved by partitioning the information into distinct databases (for example, one database per year of records.)
Note that another alternative for historical information is provided by "Object Logs" (see below), especially when reads are infrequent and speed is not critical.
Persistable objects
- Primitive numeric types and their boxed versions (including char/Character and boolean/Boolean)
- Objects of classes: String, BigDecimal and BigInteger
- Other JavaBean POJOs whose attributes are in this list
- If enabled, arrays whose contents are in this list
- Immutable objects registered by the running AODSerializer as serializable
Note: If array persistence is enabled (see the enable-arrays setting), a change to an individual element of a persisted array (that is, applying X[Z]=V) will NOT be persisted. The array must be replaced as a whole with a new version, which will be persisted in its place. As a (rather slower) alternative, consider persisting a List and using its set() method.
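The reason behind this limitation is that an array element assignment is a plain JVM store with no method a library could override, whereas List.set() is an ordinary method call. The RecordingList class below is a hypothetical illustration of such interception (it records changes in an in-memory journal list); it is not how AOD implements its lists.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list that records every mutation it receives. */
final class RecordingList<E> extends AbstractList<E> {
    private final List<E> data = new ArrayList<>();
    final List<String> journal = new ArrayList<>();

    @Override
    public E get(int index) {
        return data.get(index);
    }

    @Override
    public int size() {
        return data.size();
    }

    @Override
    public void add(int index, E element) {
        data.add(index, element);
        journal.add("add " + index + " " + element);
    }

    @Override
    public E set(int index, E element) {
        E old = data.set(index, element); // apply first, so a failed set is not journaled
        journal.add("set " + index + " " + element);
        return old;
    }
}
```

With an array there is no equivalent hook: after `String[] a = {"a"}; a[0] = "b";` nothing outside the assigning code is notified.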
Persisted Maps
These are objects which implement the Map interface or the NavigableMap interface. For example, a HashMap will have a persisted version which implements Map, while a TreeMap will have a persisted version which implements NavigableMap.
These maps allow neither null keys nor null values. A put() with a null value will silently remove the corresponding entry; a null key will throw an exception. For NavigableMap the keys must be immutable Java Comparable objects and no Comparator may be in use.
As mentioned, when a valid object is added to a persisted map (or any other persisted container), a new persisted version of it is created and effectively added to the container. The persisted version may be extracted using the usual container methods.
It is okay to add an already persisted object to a container; in this case no new persisted object will be created:
Map<String,Object> initialMap = engine.getInitialMap();
MyBean x = new MyBean();
// add the bean to the persisted map:
initialMap.put("some key", x);
// get the persisted version (the map values are Object, so a cast is needed)
MyBean p_x = (MyBean) initialMap.get("some key");
// the persisted version is a new object which
// uses a subclass of MyBean
assertFalse(x == p_x);
// put again the persisted bean:
initialMap.put("other key", p_x);
// get persisted version using the new key:
MyBean p_xx = (MyBean) initialMap.get("other key");
// no new object was created: p_xx is the same as p_x
assertTrue(p_x == p_xx);
// at the end of the day:
engine.shutdown();
Specialized Map implementations may lose their specific features in their persisted version. For example, a LinkedHashMap object will not preserve the insertion order of its entries.
Note: It is good practice to employ some naming convention to easily distinguish plain in-memory objects from persisted objects. The previous sample used a p_ prefix for this reason. To find out at run time whether an object is persisted in a given database, test with AODEngine.isPersisted(Object) (another test is instanceof AODPersisted, but this only tells whether the object is associated with some database.)
Note: The maps are persisted using the classes AODNavMap and AODMap for maps which implement NavigableMap and Map respectively. The initial map is currently an AODMap, so it is not navigable.
Persisted Sets
Objects which implement NavigableSet or Set will be persisted (as AODNavSet and AODSet respectively.) This implementation does not allow the insertion of a null element.
Persisted Lists
Any Java List object that is persisted gets a persisted version implemented by an AODList object. Internally, it is built in terms of a Java NavigableMap. The random extraction of its elements is O(log n) and not O(1) as in ArrayList. Also, there are two slow operations which imply a full copy of the list: the removal of elements (at any position) and the addition of an element in a non-final position. This is a compromise made to allow for reasonably fast operation and predictable iterators. At some point we considered CopyOnWriteArrayList but (in our opinion) its benefits require too restricted an application context.
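To make these costs concrete, here is a minimal sketch of a list stored as an index-to-element NavigableMap. NavMapList is a hypothetical class written for illustration; AODList's real internals are not shown in this document and may differ.

```java
import java.util.AbstractList;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical list stored as index -> element in a NavigableMap. */
final class NavMapList<E> extends AbstractList<E> {
    private NavigableMap<Integer, E> cells = new TreeMap<>();

    @Override
    public int size() {
        return cells.size();
    }

    @Override
    public E get(int index) {
        checkIndex(index, cells.size() - 1);
        return cells.get(index); // tree lookup: O(log n)
    }

    @Override
    public E set(int index, E element) {
        checkIndex(index, cells.size() - 1);
        return cells.put(index, element); // in-place replacement: O(log n)
    }

    @Override
    public void add(int index, E element) {
        checkIndex(index, cells.size());
        if (index == cells.size()) {
            cells.put(index, element); // append at the end: O(log n)
            return;
        }
        // Insertion before the end: full copy, shifting the tail up by one.
        NavigableMap<Integer, E> copy = new TreeMap<>(cells.headMap(index, false));
        copy.put(index, element);
        for (Map.Entry<Integer, E> e : cells.tailMap(index, true).entrySet()) {
            copy.put(e.getKey() + 1, e.getValue());
        }
        cells = copy;
    }

    @Override
    public E remove(int index) {
        checkIndex(index, cells.size() - 1);
        E old = cells.get(index);
        // Removal: full copy, shifting the tail down by one.
        NavigableMap<Integer, E> copy = new TreeMap<>(cells.headMap(index, false));
        for (Map.Entry<Integer, E> e : cells.tailMap(index, false).entrySet()) {
            copy.put(e.getKey() - 1, e.getValue());
        }
        cells = copy;
        return old;
    }

    private static void checkIndex(int index, int max) {
        if (index < 0 || index > max) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
    }
}
```

Appending and reading stay logarithmic, while remove() and a mid-list add() rebuild the whole map, which is the trade-off described above.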
Persisted POJOs
The POJOs must follow the JavaBean conventions. Their acceptable attribute types are the valid persistable types.
Persist a POJO, get its persisted version and modify it:
MyBean x = new MyBean();
// here go setters for x ...
x.setSomeProp(1);
// add the bean to the persisted map:
initialMap.put("some key", x);
// get the persisted version (the map values are Object, so a cast is needed)
MyBean y = (MyBean) initialMap.get("some key");
// reset a property, the new value will also be persisted:
y.setSomeProp(2);
Note: POJO fields annotated with @AODIgnore are ignored for persistence (they keep their default values.) This is handy when a field does not need to be persisted at all: it reduces disk consumption and allows for attributes with non-persistable types.
Multithreaded code
The objectives:
- Avoid damaging the data (creating inconsistent states)
- Avoid deadlocks
- Avoid extracting invalid data
- Support good performance
As far as we know, these objectives are impossible to satisfy simultaneously in a multithreaded application, so compromises are in place (like those imposed by the "isolation levels" of relational database systems.)
We consider that the most dangerous data inconsistency is an in-memory version of the data which, after reloading, results in different contents. This may happen if the modifications to the data structures and the disk writes are not synchronized between threads. That is the reason we employ the AODWriter permission, which is granted to a single thread at a time.
Another source of inconsistency may happen even in a single thread if a disk write is sent to the journal but the in-memory version does not allow such write, and throws some exception: it is expected that a future database load process will throw the same exception.
To avoid extracting invalid data we rely on weakly consistent containers as provided by Java’s ConcurrentHashMap, which is employed for maps, sets and lists.
Note that the synchronization of the writer methods does not imply any kind of transactional isolation nor transactional behavior.
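The "weakly consistent" behaviour mentioned above is a documented property of the java.util.concurrent collections: their iterators tolerate concurrent modification instead of failing fast. The small demonstration below uses only standard classes; WeaklyConsistentDemo is a name made up for this sketch.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentDemo {
    /** Iterates the map while inserting a key; returns false if the iterator failed fast. */
    static boolean survivesMutationDuringIteration(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String key : map.keySet()) {
                map.put("extra", 0); // structural change while an iterator is open
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ConcurrentHashMap iterators never throw ConcurrentModificationException.
        System.out.println(survivesMutationDuringIteration(new ConcurrentHashMap<>())); // true
        // HashMap iterators are fail-fast.
        System.out.println(survivesMutationDuringIteration(new HashMap<>())); // false
    }
}
```

A reader iterating a weakly consistent container may or may not observe an element added during the iteration, but it never sees a corrupted structure.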
Unpersisted versions
The AODEngine.cloneUnpersist(Object o) method creates an "unpersisted" clone that replicates the provided persisted object. If the provided object is not persisted, this method simply creates a "clone".
The type of the provided object must satisfy the same rules as for persisting objects. Also, for objects implementing the Set, Map and List interfaces, the created version will be of the original container class provided at persistence time.
Custom user types
There are non-persistable types which may be needed by the application. For example, a Java timestamp object field is not a JavaBean and can’t be persisted by AOD. A direct solution is to redesign the structure to store "simpler" fields which represent the time (for example, an integer for the day of month, another for the month, etc.)
A more elaborate way is to provide AOD with a custom serializer and notify it of the presence of the new persistable type.
AOD employs an object implementing AODSerializer to serialize the "immutable" objects (like numeric constants, strings, etc.) There are two implementations provided by AOD: the TextSerializer and the BinSerializer. The first is oriented to a textual representation of the objects; the latter leverages Java’s object stream machinery (ObjectOutputStream/ObjectInputStream).
Note: The TextSerializer in turn relies on the experimental Serializer class provided by the AJU project.
In order to support the persistence of OffsetDateTime objects, we provide a new AODSerializer by extending the com.americati.aod.impl.BinSerializer:
package org.acme;

import java.time.OffsetDateTime;
import java.util.HashSet;
import java.util.Set;

import com.americati.aod.impl.BinSerializer;

public class TimeBinSerializer extends BinSerializer {
    @Override
    public Set<Class<?>> getImmutableTypes() {
        // start from the standard immutable types and register the new one
        Set<Class<?>> ans = new HashSet<Class<?>>(super.getImmutableTypes());
        ans.add(OffsetDateTime.class);
        return ans;
    }
}
In the getImmutableTypes() method we add our class OffsetDateTime.class to the standard set of immutable types (basically the primitive autoboxed classes, plus String, BigDecimal and BigInteger.)
Remember to provide this serializer to the AOD configuration at database initialization time:
aodCfg.setSerializer("org.acme.TimeBinSerializer");
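Extending the binary serializer works for OffsetDateTime because that class implements java.io.Serializable, so the standard object streams can already write and read it. The round trip below uses only JDK classes; ObjectStreamRoundTrip is a hypothetical helper written for this illustration and does not show BinSerializer's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.time.OffsetDateTime;

final class ObjectStreamRoundTrip {
    /** Serializes any java.io.Serializable value to a byte array. */
    static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds the value from the bytes produced by toBytes(). */
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        OffsetDateTime original = OffsetDateTime.parse("2024-01-02T03:04:05+01:00");
        Object copy = fromBytes(toBytes(original));
        System.out.println(original.equals(copy));
    }
}
```

A type that is not Serializable would fail in writeObject() with a NotSerializableException, so it would need a different strategy.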
Object Logs
These are ever-growing sequences of objects which are stored on disk, but not in memory. The objects must be of a persistable type as previously described, but no persisted version will be available in memory.
To extract the saved objects an Eater must be provided to the read() method in order to process the loaded (unpersisted) objects one at a time, by doing a total disk read (which may be slow depending on the size.) Note that, unlike the databases, the extracted contents will not remain in memory after the read.
Note: The extracted objects may be mutated by the Eater without further effect (no further persistence will occur.) New instances will be created for further calls to read().
For example, for storing and printing some strings:
Result<AODLog<String>,AODERR> r2 = AOD.startLog(cfg);
assertTrue(r2.hasData());
AODLog<String> e2 = r2.getData();
e2.persist("Krat");
e2.persist("Krit");
e2.persist("GOR");
e2.persist("Tang");
e2.shutdown();
// add another string
Result<AODLog<String>,AODERR> r3 = AOD.startLog(cfg);
assertTrue(r3.hasData());
AODLog<String> e3 = r3.getData();
e3.persist("ELE");
// read and print the stored strings
Optional<AODERR> r = e3.read(s->{
System.out.println(s);
});
assertFalse(r.isPresent()); // no exceptions were thrown
Note that the Eater will be executed by the same thread that calls read(). If the Eater throws an exception, it counts as a "load error" which may interrupt the process depending on the max-load-errors configuration setting.
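The append-then-scan idea behind an object log can be sketched with a plain file of length-prefixed strings. StringLog below is a hypothetical stand-in, not AOD's AODLog: it uses DataOutputStream.writeUTF for the records and a java.util.function.Consumer in the role of the Eater.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

/** Hypothetical append-only log of strings kept on disk only. */
final class StringLog {
    private final Path file;

    StringLog(Path file) {
        this.file = file;
    }

    /** Appends one record; reopening the file per record keeps the sketch short. */
    void persist(String s) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
            out.writeUTF(s);
        }
    }

    /** Scans the whole file, handing each record to the eater; returns the record count. */
    int read(Consumer<String> eater) throws IOException {
        if (!Files.exists(file)) {
            return 0;
        }
        int count = 0;
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            while (true) {
                String s;
                try {
                    s = in.readUTF();
                } catch (EOFException end) {
                    break; // no more records
                }
                eater.accept(s); // runs on the caller's thread, one record at a time
                count++;
            }
        }
        return count;
    }
}
```

Nothing is retained between calls: every read() scans the file again and produces fresh String instances.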
Relational model facilities
AOD supports the relational model for domains where it is needed. The relational tables are abstracted by the AODRelTab interface, which is based on the following concepts:
- The table contents (rows) are persisted JavaBeans of a single type
- The table has a Java Comparable primary key, which usually is a row class attribute, or a new class which combines some of the row class attributes
- The primary key is obtained from an object which implements AODIndexer
There is a GenIndexer class which simplifies the implementation of AODIndexer instances. The AODIndexer instances can’t be created from anonymous classes.
Note: Internally, the tables are backed by a ConcurrentSkipListMap.
There are also indexes, abstracted by the AODRelIndex interface (which is also implemented by AODRelTab as a special case.) Again, the index key must be generated from an AODIndexer instance.
Finally, there are subsets of the table contents which are referred to as views, abstracted by the base AODRelView interface, which provides several query-oriented methods.
The AODRelTab tables support the usual insert, update and delete methods, and the creation of indexes.
Also, AODRelIndex objects (which include tables) allow the establishment of foreign keys, referring to a "parent" index in another table.
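The combination of a Comparable primary key, a key-extracting indexer and a ConcurrentSkipListMap can be sketched as follows. SketchTable and its method names are hypothetical and much simpler than AODRelTab; a java.util.function.Function plays the role that AODIndexer has in AOD.

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Function;

/** Hypothetical table: rows of type R ordered by a Comparable primary key K. */
final class SketchTable<K extends Comparable<K>, R> {
    private final ConcurrentNavigableMap<K, R> rows = new ConcurrentSkipListMap<>();
    private final Function<R, K> indexer; // derives the primary key from a row

    SketchTable(Function<R, K> indexer) {
        this.indexer = indexer;
    }

    /** Inserts the row unless its key is already present. */
    boolean insert(R row) {
        return rows.putIfAbsent(indexer.apply(row), row) == null;
    }

    /** Replaces the row stored under the same key, if any. */
    boolean update(R row) {
        return rows.replace(indexer.apply(row), row) != null;
    }

    boolean delete(K key) {
        return rows.remove(key) != null;
    }

    R find(K key) {
        return rows.get(key);
    }

    /** Rows with fromInclusive <= key < toExclusive, in key order. */
    Collection<R> range(K fromInclusive, K toExclusive) {
        return rows.subMap(fromInclusive, toExclusive).values();
    }
}
```

For instance, `new SketchTable<Integer, String>(String::length)` keeps strings keyed by their length; the ordered map is what makes range queries by key cheap.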
Saving configuration settings
The AODCfg configuration object may be stored in the database directory in a file named name.aod (which has the standard Java Properties format.)
On disk, the settings are denoted as follows:
df-max-length
When the running datafile reaches this size (in bytes), the next one is created. Defaults to "67108864" (64 megabytes). Use zero to disable this criterion.
df-max-old
When the running datafile reaches this age, the next one is created; the value is the text representation of a Java Duration object.
df-max-cnt
When the running datafile reaches this number of uninterrupted record writes, the next one is created. Defaults to "0" (disabled.)
df-check-cnt
How frequently to check the new-datafile criteria, in terms of record writes. Defaults to "256"; that is, the previous checks are done every 256 writes to the journal. The new datafile is created if at least one of the previous criteria is satisfied.
comp-auto
If "true" or "1" (the default), enables automatic compression, governed by the following two settings, which must be simultaneously satisfied for the compression to actually take place; "false" or "0" disables automatic compression (it is expected that another process fires it at some interval.)
comp-cron
Crontab-style expression for time to fire the compression (default is "0 0 0 2 * *", that is, every day at 2 AM.)
comp-df-min
Minimum number of existing datafiles to fire the compression (minimum and default = "3".)
serializer
Class name which implements AODSerializer. Defaults to com.americati.aod.impl.TextSerializer; also provided is com.americati.aod.impl.BinSerializer. Note that this parameter can only be set at database initialization time.
enable-arrays
Defaults to false. If true, allows the persistence of arrays. We consider this a relatively dangerous feature since the changes applied to the elements of a persisted array are not detected by AOD (because of JVM limitations.) See the Persistable objects section for more details.
max-load-errors
Defaults to zero. A non-negative value sets the maximum number of acceptable "logical" errors during the data load process (for example, a de-serialization problem.) A negative value disables this check. Note that data corruption at the journaling level can’t be ignored by this mechanism: the data files must be repaired for the database to start.
flush-interval
In milliseconds, the interval for flushing the writer buffer. If zero (the default), the writer is unbuffered. This automatic facility is designed to mitigate the possibility of journal corruption due to a system crash.
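Since the settings file uses the standard Java Properties format, it can be inspected with the JDK alone. The sketch below is an assumption-laden illustration: SettingsSketch is a hypothetical helper, the sample values are invented, and it assumes that the df-max-old text is the ISO-8601 form accepted by Duration.parse (for example PT12H); it does not show how AODCfg itself reads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.time.Duration;
import java.util.Properties;

/** Hypothetical reader for a few of the settings listed above. */
final class SettingsSketch {
    static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    /** df-max-length in bytes, falling back to the documented default of 64 MB. */
    static long maxLength(Properties p) {
        return Long.parseLong(p.getProperty("df-max-length", "67108864"));
    }

    /** df-max-old as a Duration, or null when the setting is absent. */
    static Duration maxOld(Properties p) {
        String v = p.getProperty("df-max-old");
        return v == null ? null : Duration.parse(v);
    }

    /** comp-auto accepts "true" or "1" as enabled; enabled is the documented default. */
    static boolean compAuto(Properties p) {
        String v = p.getProperty("comp-auto", "true");
        return v.equals("true") || v.equals("1");
    }

    public static void main(String[] args) throws IOException {
        // Invented sample content for a name.aod file.
        Properties p = parse("df-max-length=1048576\ndf-max-old=PT12H\ncomp-auto=0\n");
        System.out.println(maxLength(p) + " " + maxOld(p) + " " + compAuto(p));
    }
}
```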
TODO
- A tool for reconstructing partially damaged datafiles
- A command line tool as in SQL databases