Updated for version 1.20pl71.

Introduction

AFT provides a secure and flexible file transfer infrastructure for heterogeneous environments that work on a cyclic schedule and need to automate their operations.

The links are TCP sockets and by default the transmitted data is encrypted with TLS.

Basic Concepts

AFT works in two basic roles: hub or agent. The hubs "take control" of a number of agents. The hubs implement file transfers in three modes:

  1. GET for extracting files from a "source agent"

  2. PUT for pushing files into a "destination agent"

  3. A2A (agent to agent) for extracting files from a "source agent" (to a temporary directory) and then pushing those files into a "destination agent"

The hubs and the agents allow the configuration of user-defined transfers (in any working mode) identified by a simple label or "id".

Hub transfers in GET mode are identified by [hub-get:id] configuration sections, those in PUT mode by [hub-put:id] sections, and those in A2A mode by [hub-a2a:id] sections.

For a "GET hub transfer" there must be a corresponding [agent-src:id] section in the agent configuration; for a "PUT hub transfer" there must be a corresponding [agent-dst:id] section; finally, for an "agent to agent hub transfer" two agents must be configured, one with an [agent-src:id] section and the other with an [agent-dst:id] section.

HUB-GET    <=====    AGENT-SRC

HUB-PUT    =====>    AGENT-DST

AGENT-SRC  =====>    HUB-A2A     =====>  AGENT-DST
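
The pairing by id can be sketched as follows. This is only an illustration: the id nightly, the host name and the directories are hypothetical, each fragment belongs to the aft.cfg file of a different host, and the settings themselves are described in the Configuration section.

# aft.cfg on the hub host
[hub-get:nightly]
src-host=agent1.example.com
dst-dir=/var/aft/incoming

# aft.cfg on the agent host (listening on the default port 18536)
[agent]
agent-port = default

[agent-src:nightly]
dir = /var/aft/outgoing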

The communication is encrypted by default using TLS. Both the global agent and the hub’s file transfer sections support encryption key settings which must be manually synchronized by the administrator.

In the design of AFT, it was envisioned that a hub instance would communicate with several associated agents; but nothing prevents the user from setting up several pairs of hub/agent instances for independent file transfers, or agent instances serving more than one hub, even simultaneously.

Running AFT

Step 1

AFT requires Java 8 or higher.

Step 2

A configuration file named aft.cfg (in the etc subdirectory) must be provided by the user. The rest of this document deals with its construction.

Step 3

Start with:

cd bin
./run-aft.sh

On Windows systems, double-click the bin\run-aft.bat batch file, or execute it manually:

cd bin
run-aft.bat

In an interactive session, AFT can be asked to list its command line options:

java -cp aft-1.20pl71.jar com.americati.aft.Main -h

In order to stop AFT, just kill the running java or javaw process with the standard operating system tools.

Configuration

The aft.cfg file is used in order to configure the hubs and the agents. This is a plain text file with sections delimited by headers.

The hubs define a number of "programmed file transfers" by [hub-a2a:name], [hub-get:name] and [hub-put:name] sections; optionally a [hub] section may be used to set some global parameters. The transfers are implemented by the hub "controlling" the agents by means of a TCP socket; the connection is established from the hub to the agents, or from agents to the hub.

The agents are defined by a number of configured file transfers, each of which starts with an [agent-src:name] or an [agent-dst:name] section; optionally, an [agent] section may be present in order to set some common parameters.

Note that the configuration file may be modified at any time; it is reloaded about every minute, except when a transfer is in progress.

Agent Configuration

The agent configuration allows an optional [agent] section, which provides the following settings:

agent-port

A mandatory setting for listening agents. Can be set to 'default' (without quotes) which means 18536.

max-children

An optional setting for the maximum number of simultaneous connected hubs. Defaults to 5.

Example:

[agent]
agent-port = 20111
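
A variant, given here only as a sketch, uses the default port and raises the connection limit (the value 10 is arbitrary):

[agent]
agent-port = default
max-children = 10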

Next, a number of "file transfers" are defined for the agent side, which allow the hub to extract or push files, from or into the agent host. Both cases correspond to [agent-src:name] and [agent-dst:name] sections respectively.

dir

The directory from which the files are to be extracted (in agent-src configurations) or into which the files will be written (in agent-dst configurations.)

Example:

[agent]
agent-port = 20111

[agent-src:bravo]
dir = /home/bravo/x-files

exec-after-transfer

A command to be executed via Java’s ProcessBuilder after every file is transferred to/from the agent. The file name is added to the command. If the execution fails, the failure is logged but the transfer is not stopped.

Warning: the AFT agent process may be blocked indefinitely if the command hangs.

The command must be an existing executable which will be executed via java’s ProcessBuilder with the file name as a single argument. No extra arguments are allowed for the command.
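
A minimal sketch of this setting follows. The script path is hypothetical, and the placement in the transfer section is an assumption based on where the dir setting is configured:

[agent-src:bravo]
dir = /home/bravo/x-files
# hypothetical script; it receives the file name as its single argument
exec-after-transfer = /usr/local/bin/archive-file.sh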

Agent in Client Mode

When the agents initiate the TCP connection (TCP client), the following settings must be set:

hub-host

The hub hostname or IP address.

hub-port

The hub listening port number.

hub-connect-check

When the agent connects to the hub. It can take the form delay:# for a fixed retry interval given as a number of seconds, cron:expr where expr is a crontab expression, none (don’t attempt to connect; the hub will initiate the connection), or once, which means to attempt the connection, do the work and shut down the AFT process. Defaults to none.

The once mode definitions are activated when AFT is started with the -agent=name command line argument; else, they are ignored.

Note that the agents may initiate or wait for the TCP connections with the hub, but this is totally independent of the agent-src/agent-dst transfer modes.

Example:

# a single client mode agent transfer
# no [agent] nor agent-port needed at all

[agent-src:bravo]
dir = /home/bravo/x-files
hub-host = 192.1.4.51
hub-port = 6001
hub-connect-check = delay:3600
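
The once mode can be sketched in the same way. The fragment below reuses the values of the previous example; that the name given to -agent is the transfer label (bravo) is an assumption, and the class path is the one shown in Running AFT:

[agent-src:bravo]
dir = /home/bravo/x-files
hub-host = 192.1.4.51
hub-port = 6001
hub-connect-check = once

and then, for a single run:

java -cp aft-1.20pl71.jar com.americati.aft.Main -agent=bravo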

Hub Configuration

The hub configuration global section is optional, but is needed when the hub acts as a TCP server.

hub-port

The listening port, set in the special [hub] section. It is only needed if the hub will receive incoming connections from agents.

max-children

An optional setting for the maximum number of simultaneous connected "client" agents. Defaults to 100.

Example:

[hub]
hub-port = 6001

The hub configuration has a number of programmed file transfers. For "agent to agent" file transfers, a temporary storage directory is needed, which we call the "hub queue directory".

Example:

In the following configuration, my-transfer is a configuration which transfers files from the host 23.45.122.50 to the host 23.45.121.33 using an intermediate queue directory /home/aft/queue1 located on the hub host. Note that the default agent port (18536) is used in both cases:

[hub-a2a:my-transfer]
src-host=23.45.122.50
queue-dir=/home/aft/queue1
dst-host=23.45.121.33

The following settings are provided:

src-host

Source agent hostname or IP address where the hub will attempt a connection in order to extract files. Valid for "A2A" and "GET" modes. If not present, then it is assumed that the source agent will initiate the connection.

src-port

Port number of the source agent in order to establish the communication. By default it is 18536.

dst-dir

Final location (directory) for the files on the destination host. Only valid for GET file transfers. AFT does not create this directory.

Example:

# a GET file transfer: extract files every 12 hours
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
src-connect-check=delay:43200
dst-dir=/tmp/hub-destination

dst-host

Destination agent hostname or IP address where the hub will attempt a connection in order to push files. Valid for "A2A" and "PUT" modes. If not present, then it is assumed that the destination agent will initiate the connection.

dst-port

Port number of the destination agent in order to establish the communication. By default it is 18536.

src-dir

Location (directory) of the files to be extracted. Only valid for PUT file transfers. AFT does not create this directory.

Example:

# a PUT file transfer
[hub-put:t1]
src-dir=/tmp/hub-dir
dst-host=fx8320
dst-port=55123

queue-dir

Directory on the hub computer for temporary storage of extracted files before they are sent to the destination host. The directory must exist and allow the creation of files; AFT does not create it. A queue directory can’t be shared by two or more transfers. Only valid for "A2A" mode.
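
For instance, two A2A transfers need two distinct queue directories. In this sketch the transfer names, hosts and paths are hypothetical, and both directories must be created beforehand:

[hub-a2a:orders]
src-host=10.0.0.11
queue-dir=/home/aft/queue-orders
dst-host=10.0.0.21

[hub-a2a:invoices]
src-host=10.0.0.11
queue-dir=/home/aft/queue-invoices
dst-host=10.0.0.22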

no-change-check-seconds

When a file is about to be read, AFT checks its modification time in order to avoid transmitting an incomplete file that is still being written. This setting configures the minimum age (in seconds) the file must have in order to reasonably guarantee that its writing process has finished. When the file is "too new", another attempt is made after the number of seconds defined in the op-timeout-seconds setting. A value of zero disables this check. Only valid for "PUT" mode.

no-change-retry-times

How many times to retry the extraction if a file is too new as per no-change-check-seconds.
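
As an illustration (the values are arbitrary), the following PUT transfer only sends files that have not been modified during the last 60 seconds, and retries up to 3 times when a file is too new:

[hub-put:t1]
src-dir=/tmp/hub-dir
dst-host=fx8320
dst-port=55123
no-change-check-seconds=60
no-change-retry-times=3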

src-connect-check

When to connect to the source agent in order to extract pending files. It can take the form delay:# for a fixed retry interval given as a number of seconds, cron:expr where expr is a crontab expression, or none (don’t attempt to connect; the agent will initiate the connection.) If no src-host is defined, defaults to none; if src-host is defined, defaults to delay:900, which means an extraction attempt every 15 minutes. The cron expression follows the semantics of the Spring Framework.
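
A sketch of the cron form, assuming the six-field layout of Spring cron expressions (second, minute, hour, day of month, month, day of week); the schedule itself is arbitrary:

# a GET file transfer: extract files every day at 02:30
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
src-connect-check=cron:0 30 2 * * *
dst-dir=/tmp/hub-destination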

files

What files are to be extracted and transferred. Valid formats are all, list:name,…, ereg:expr, and tereg:expr; defaults to all (all the files in the directory.) The list mode allows a comma-separated list of exact file names to be specified; the ereg mode is used to specify a regular expression pattern that the file names must match. Finally, tereg is a two-step process: the subexpressions enclosed between %…% are first used as Java’s SimpleDateFormat formatters with the current time (in order to build a time-dependent pattern), and the result is then used as a regular expression (as in the ereg case.) Two consecutive percent signs generate a single percent sign.

Example:

# only extract the three specified files
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=list:file2.txt,file3.txt,file4.txt

The same result may be obtained with:

# only extract the three specified files
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=ereg:file[234]\.txt

A time-related expression using the tereg: format:

# only extract the files for today
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=tereg:%yyyyMMdd%\.txt
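
The doubled percent sign may be sketched as follows. The file naming scheme is hypothetical; assuming the current date is 2024-05-17, the pattern below should expand to the regular expression 20240517-100%\.txt:

# only extract today's file, whose name contains a literal percent sign
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=tereg:%yyyyMMdd%-100%%\.txt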

The list:, ereg: and tereg: forms also admit a negated form with the corresponding !list:, !ereg: and !tereg: prefixes. For example:

# extract every file except today's YYYYMMDD.txt one
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=!tereg:%yyyyMMdd%\.txt

dst-connect-check

When to connect to the destination agent in order to push files. See src-connect-check for the syntax and default value.

dst-connect-after-transfer

Whether to try to connect to the destination agent immediately after the extraction from the source agent, in addition to the schedule programmed with dst-connect-check. Set to true or false. Defaults to false.
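
For example, in the following sketch (which reuses the hosts of the my-transfer example, with arbitrary delays) the files are extracted every hour and pushed right after each extraction, while the destination is also contacted every two hours:

[hub-a2a:my-transfer]
src-host=23.45.122.50
src-connect-check=delay:3600
queue-dir=/home/aft/queue1
dst-host=23.45.121.33
dst-connect-check=delay:7200
dst-connect-after-transfer=true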

src-exec-after-transfer

After the successful extraction of a file from the source agent, a user-defined program may be executed by specifying an executable name with this setting. The name of the file just extracted is passed as an argument to this program. The program is executed on the hub host. If the execution fails, the failure is logged but the transfer is not stopped. This setting is valid for "A2A" and "GET" modes.

dst-exec-after-transfer

After the successful delivery of a file to its destination agent, a user-defined program may be executed by specifying an executable name with this setting. The name of the file just transferred is passed as an argument to this program. The program is executed on the hub host. If the execution fails, the failure is logged but the transfer is not stopped. This setting is valid for "A2A" and "PUT" modes.
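
A sketch combining both settings in an A2A transfer; the script paths are hypothetical, and both programs run on the hub host:

[hub-a2a:my-transfer]
src-host=23.45.122.50
queue-dir=/home/aft/queue1
dst-host=23.45.121.33
# hypothetical scripts; each receives the file name as its argument
src-exec-after-transfer=/usr/local/bin/log-extracted.sh
dst-exec-after-transfer=/usr/local/bin/log-delivered.sh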

cleanup-mode

What to do with the source file after successful transfer. Valid options are remove, truncate, and none; defaults to remove.

Note: this action is carried out after any configured command execution.

Warning: the default remove may be unexpected to some (most?) users. The rationale is that the information is not lost at all since the transfer was effectively done. Be careful!
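
To keep the source files untouched, the default can be overridden as in this sketch, which reuses the GET transfer of the previous examples:

[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
cleanup-mode=none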

compress-mode

Compression operation mode. Valid settings are:

  1. none to avoid any compression

  2. network to compress data in transit in order to reduce network traffic

  3. gzip to compress data in transit and store the files in Gzip compressed format

Defaults to network. The none mode is useful when transmitting already compressed files or any kind of non-compressible files (like encrypted ones.) It avoids the CPU consumption required by the compression/decompression operations.
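
For instance, a transfer of already compressed archives might disable compression, as in this sketch (the transfer name and the file pattern are hypothetical):

[hub-get:archives]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=ereg:.*\.zip
compress-mode=none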

write-mode

Valid settings are simple and tmp. The simple mode just opens the file for writing under its final file name and writes as the data is being transferred. The tmp mode opens a temporary file (in the same destination directory) and renames it to the final file name only when all the data has been transferred. Defaults to simple.

op-timeout-seconds

The timeout for connection setup and reply arrival. Defaults to 15 seconds. In very congested networks this could be increased.
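
A sketch of a transfer over a congested link, with an arbitrary timeout value: the files are written through a temporary file and the timeout is raised to 60 seconds:

[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
write-mode=tmp
op-timeout-seconds=60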

When a transfer is initiated by an agent, the src/dst host and port settings must not be set, and the src/dst connection check setting must be set to none. For example, if the files are to be extracted from an agent which connects to the hub, and then sent to an agent which waits for the hub connection, the following configuration is in order:

# extract from "client" agent, push to "server" agent
[hub]
hub-port = 5122

[hub-a2a:the-transfer]
src-connect-check=none
queue-dir=/home/aft/queue4
dst-host=fx8320
dst-port=55123
# try to send every hour from 9AM to 5PM, on weekdays only
dst-connect-check=cron:0 0 9-17 * * MON-FRI

TLS Configuration

By default, AFT uses a built-in self-signed certificate for the file transfers, which provides data encryption but does not prevent unauthorized parties (who also have AFT) from interacting with the participating nodes.

AFT allows the 'mutual authentication' of the interconnections, relying on digital certificates which may be created by external tools or by the built-in wizard aft-cert provided in the AFT distribution.

The wizard allows the creation of a self-signed root certificate authority (CA) in order to issue the per-node certificates. The root certificate file must be transferred to the deployed nodes and be referenced with the tls-root-cert setting in the [tls] section.

Each node must have a TLS certificate file and its corresponding private key file, referenced by the tls-node-cert and tls-node-pk settings of the same section. A typical configuration looks like:

[tls]
tls-root-cert=etc/root-ca.crt
tls-node-cert=etc/NODE1.crt
tls-node-pk=etc/NODE1.key

Note that the private key must not be encrypted (else a password would be necessary in the configuration file which defeats the original purpose.) The private key file must be protected with the operating system permissions.

The parties may be configured to authenticate the peer node of the interconnection (inbound or outbound) by specifying the peer’s certificate common name (CN.) The tls-peer setting is used for that purpose.

Note that the peer authentication is optional, but strongly recommended.

For nodes with transfers in TCP-client mode, the peer authentication must be configured per transfer (agent or hub.) For example:

# the peer (agent) node must present a valid certificate with CN=FX8320
[hub-get:t2]
src-host=fx8320.ana.com.uy
src-port=41232
dst-dir=/tmp/hub-destination
files=ereg:file[234]\.txt
tls-peer=FX8320

For nodes with transfers in TCP-server mode, the peer authentication must be configured alongside the server port setting. For example:

# the peer (agent) node must present a valid certificate with CN=NODE-777
[hub]
hub-port = 6001
tls-peer=NODE-777

That is, the peers can’t be configured per-transfer. The reason is that the TLS handshake and authentication happens before the identification of the actual file transfer.
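
By analogy, an agent in TCP-server mode would presumably set tls-peer in its [agent] section, next to agent-port. This placement is an assumption, since the text above only shows the hub case; the CN value HUB-1 and the certificate file names are hypothetical:

[tls]
tls-root-cert=etc/root-ca.crt
tls-node-cert=etc/FX8320.crt
tls-node-pk=etc/FX8320.key

# the peer (hub) node must present a valid certificate with CN=HUB-1
[agent]
agent-port = 41232
tls-peer=HUB-1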

Manual Encryption Configuration

The agents may have a (unique) encryption key for their file transfers. Hubs may have per-transfer encryption keys.

In any case, AFT supports the following encryption settings. Note that only one may be selected (else it would be a contradiction.)

crypto-key

For triple-DES, use the prefix desede: followed by 48 hexadecimal characters representing 24 bytes (three blocks of 8 bytes.) The first two blocks correspond to the triple-DES EDE keys; the last block is the initialization vector.

For AES, use the prefix aes: followed by a hexadecimal representation of an AES key and a 12-byte initialization vector. For AES 128, 192 and 256, provide 16+12=28, 24+12=36 or 32+12=44 bytes, corresponding to 56, 72 and 88 characters respectively.
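
The expected lengths can be illustrated with the following sketch. The hexadecimal values are placeholders and must never be used as real keys; the AES value assumes that the key bytes come first and the initialization vector last. For triple-DES:

# 16 + 16 + 16 = 48 hexadecimal characters (placeholder value)
crypto-key = desede:0123456789abcdeffedcba98765432100011223344556677

or, alternatively, for AES-128:

# 32 (key) + 24 (initialization vector) = 56 hexadecimal characters (placeholder value)
crypto-key = aes:00112233445566778899aabbccddeeff0102030405060708090a0b0c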

crypto-seed

Derive a key from a simple text string (like a password.) For triple-DES use the prefix desede:, and for AES use aes:.

Example:

[agent]
agent-port = 5123
crypto-seed = aes:r5Fq12p9cw

The encryption settings can’t be set per agent transfer. A bad example follows:

[agent]
agent-port = 20111

# Wrong!
[agent-src:bravo]
dir = /home/bravo/x-files
crypto-seed = aes:r5Fq12p9cw

And for every hub transfer:

[hub-a2a:liz]
crypto-seed = aes:r5Fq12p9cw
src-connect-check=none
queue-dir=/home/aft/queuez
dst-host=fx8320
dst-port=55123
dst-connect-check=delay:7200

[hub-a2a:cla]
crypto-seed = desede:bnlfaU1fh
src-connect-check=none
queue-dir=/home/aft/queuec
dst-host=192.4.112.41
dst-port=13551
dst-connect-check=delay:7200

But not for the hub section:

# Wrong!
[hub]
hub-port = 20111
crypto-seed = desede:bnlfaU1fh

For AES, the key length defaults to 128 bits; add a numeric prefix before the seed to specify 192 or 256 bits:

crypto-seed=aes:256:vzx938rn2he2

crypto-store

A prefix and a file name for a key storage.

The key storage file is created by AFT with the command line options -gen-des-store (triple-DES) and -gen-aes-store-128, -gen-aes-store-192 or -gen-aes-store-256 for AES.

Note that this method uses the JCEKS keystore type. It avoids setting keys and/or passwords in configuration files, which is a good security practice. An interactive session is needed in order to generate the keystore:

java -cp aft-1.20pl71.jar com.americati.aft.Main -gen-des-store=store.keys

The key store file (store.keys in the example) needs to be transferred by some mechanism to the remote host. Then, both peers may use it:

[agent]
agent-port = 5123
crypto-store = desede:store.keys

and:

[hub-a2a:liz]
crypto-store = desede:store.keys
src-connect-check=none
queue-dir=/home/aft/queuez
dst-host=fx8320
dst-port=55123
dst-connect-check=delay:7200
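
The AES variant can be sketched along the same lines. Both the =file syntax of -gen-aes-store-256 and the aes: prefix of crypto-store are assumptions made by analogy with the triple-DES example above; the file name is hypothetical:

java -cp aft-1.20pl71.jar com.americati.aft.Main -gen-aes-store-256=store-aes.keys

and then, for instance on the agent side:

[agent]
agent-port = 5123
crypto-store = aes:store-aes.keys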