The TechFell protocol is a very simple lightweight protocol for making
connections to SQL databases. It has primarily been aimed at SQLite
(www.sqlite.org), allthouth nothing much excludes it's application
with other database servers. It's primary advantages are:
Very small and simple. Emphasis is on traditional proven
techniques.
Easy to implement into obscure environments where standard
database drivers/layers are not available (embedded systems,
application scripting languages etc.).
May be incorporated into web controls (Flash, ActiveX, .Net ecc)
without specific server backends.
Possibility of using the protocol from text terminals or similar
with no software.
May be used in 'machine' mode to get more compact result sets
that are easier to parse than the 'human' (direct display) mode.
May be used from environments where data streams cannot contain
control characters (see :PPRAGMA ETX for limitations)
Highly portable.
Simple but effective access system allows reasonably secure
deployments on WANS/internet.
Possibility of elaborating incoming data 'on the fly' to minimize
memory requirements.
No external dependencies (db drivers, ODBC subsystems etc.).
Designed for TCP/IP but may be used on any stream, for example
RS485
With a suitable server may be used for inter process / inter
system communications.
Despite the above advantages it is also fast and capable of
dealing with large datasets.
How it works
The principle is very simple, the client sends a string with a query,
and the server with data, or an error message, or just a simple 'OK'
for a query that does not produce a result set.
The (optional) HUMAN mode sends reply data as readable text:
The more conventional (and default) mode is MACHINE mode. In this mode
it is assumed that the data is to be intercepted by code, so it is made
a bit more machine readable and compact:
As can be seen, in this mode the column headers are sent at the start
of the result which is followed by the fields. Of course the
query is not limited to data lookups, you may carry out other SQL
operations or get database schema:
update tagvals set
Val=4 where Idx=2
:OK
select A,B from tagvals
:Err : SQL error : no such column: A
:OK
You have probably grasped the idea by now. All that is needed is a few
rules.
Access control
SQLite is designed as an embedded database and does not implement
permissions (GRANT and REVOKE), or any other kind of security,
allthougth hooks are present which could allow an eventual
implementation. Standard SQL access control tends to be quite complex
and fine grained, and whilst being essential for many traditional SQL
database applications it is unnecessarily complicated for the type of
applications that the TechFell protocol is designed for.
On the other hand, some level of protection is essential if a database
is to be used on open networks or the internet, so the TechFell
protocol implements a simple protection scheme that mimics access
control on filesystems. It is supplied as a complimentary technique to
normal SQL security, not as a substitute. Each connection to the server
is assigned an access level, or uselevel, which determines the level of
access which is allowed. It is made up of a bitfield with the following
weights:
So a user with level 1 can only do
selects.
A user with level 7 can Select, Update
& Insert
A user with level 12 can only Insert
and Delete
A user with level 31 is a power user
It is up to the server implementation to decide how userlevels are
defined, but they would typically be dependent on both the User name
and the network address from which they are connecting.
:PPRAGMA commands
:PPRAGMA commands are pragma commands intended for the database wrapper
rather than the database proper ( PrePragma). The wrapper handles the
command and generates a :PPRAGMA reply. These commands are specific to
the wrapper, they control access and change the configuration.
NOTE: Case is important in :PPRAGMA commands and the server need not
respect any line editing commands (backspace etc.) Frustrating, but
:PPRAGMA's are little used and the TechFell protocol is designed to
make servers as easy as possible, not just the clients :-)
:PPRAGMA USER and :PPRAGMA PASS / :PPRAGMA NEWPASS
Logs in a user:
:PPRAGMA USER Level5 :PPRAGMA USER Level5 :OK
:PPRAGMA PASS foo :PPRAGMA USELEVEL 0 :OK
In this example the user logs in as 'Level5' (an arbitrary
name), and the server confirms. The user then gives the password 'foo'.
The server replies with the access level obtained. The user may carry
on issuing new user and pass commands until a desired level is
achieved, but a secure server would of course close the socket
after a few failed attempts, or slow down the response. In this case
the user has re-attempted with a blank password. In order to give a
blank password it is necessary to leave a single <spc> after the
word PASS. After entering with a blank password the user uses NEWPASS
to set a password to 'newp', and then logs in again to check that newp
works. The users then changes the password again to 'newp2'. As the
password is already set it is necessary to give the old password as
the second argument.
:PPRAGMA ETX
The ETX character (ASCII 3) is normally used to terminate
'lines', this means that text fields may contain text formatting
control characters such as CR, LF and TAB as they have no significance
to the protocol. However, as some environments do not allow the
insertion of any control characters in a data stream, the server will
always start in 'newline' mode, that is it will accept any combination
of ETX, CR and LF as a terminator, and will issue CR as a terminator on
the lines it sends back. The :PPRAGMA ETX command requests the server
to switch to ETX. This would normally be done straight after
connecting. Once a client has received a line terminating with an ETX
rather than a CR or LF, it will switch to ETX mode itself and will only
use ETX both on reception and transmission. ETX is the normal mode but
cannot be set as default, so applications should issue the :PPRAGMA ETX
command as sson as they connect to the database.
:PPRAGMA HUMAN and :PPRAGMA MACHINE
The role of these commands is explained in the 'How ir works' section.
The server will always start in MACHINE MODE by default, you may switch
back and forth between HUMAN and MACHINE modes as often as you
like. Note that whilst MACHINE mode is canonical, HUMAN mode is simply
defined as being 'easy to read', the format may vary.
:PPRAGMA ID
:PPRAGMA ID returns a unique ID number for the connection. This should
be unique for the entire server lifetime, not just the client. It is
useful for creating exclusive names in common namespaces.
Recordsets and replies
If the query does not require any data sets to be returned it will
simply reply ':OK' to acknowledge the request.
If data is to be returned then the general format is as
follows: (see also the examples in the 'How it works'
section)
Header section (column header information)
':R' seperator
Field data (sequential field contents sent in
column/row order)
Header records are sent in the following format:
:H5:8 WriteVal
the first value after the :H is the column number. The second value is
the size of the column name (that is the text that follows it, not the
column width).
The list of column headers is terminated with an ':R'. Following the
seperator the field values are sent in column x row order, generally
with no prefix. There is an entry for each field, NULL fields are sent
as an exclamation '!'.
The exception to this format occurs when the field value starts with a
colon ':' or an exclamation '!' or when the field contents are more
than 32 characters. In these cases a ':Fn ' prefix is used:
:F42 !This non null field starts with an ! mark
:F1 !
:F3 :::
The value following the F indicates the number of characters in the
data field. A single <spc> character seperates the prefix from
the start of the data field.
Error Reporting
Errors are reported in line as they occur by means of :Err lines:
select Fullnames,Idx from tagvals
:Err : SQL error : no such column: Fullnames
:OK
The ':Err :' prefix is
followed by a text string describing the error which may have any
format. It is important to note that an :Err line may occur anywhere
between the query and the final :OK
In particular, errors may occur in the field data section, for example:
Needless to say, in the case of grave errors or anomalies the server
will simply close the connection.
Data Representation
SQLite treats all data as ASCII strings (more recent version also have
Unicode or wide string support, the TechFell protocol does not support
this). Column types in SQLite are used to indicate what type of data
the column holds, but it does not enforce this. The TechFell protocol
embraces this approach and also treats all field values as strings, it
does not communicate the field type, but this information is available
by means of pragma commands to get table schema. This is OK for
embedded applications as the apps should know what they are dealing
with. For the TechFell protocol the following conventions are
recommended:
Numbers
Integers should be represented 'as-is' in decimal. The 0x prefix
is recommended for hex numbers but decimal forms are preffered.
Real values should always include a decimal point and 0, even if
there is no fraction.
Real values may be expressed in exponential form, 3.0E8 (3E8 is
not acceptable)
Booleans should be expressed as 0 or 1 where 0 is false.
Decimal point is always a '.' locales that use a ',' should be
converted prior to insertion.
Plus sign '+' is optional. Minus '-' prefixes numbers and
mantissas and is optionally placed immediately after the E for
exponents (3.0E-8)
Text
Text may contain text formatting control characters such as CR,
LF, TAB, HT & VT and BS. ESC characters are allowed.
CR and LF may not be used in any column that is to be read by a
non ETX device (these are rare, in the case of HUMAN mode on text
terminals, CR and LF are OK as they just show up 'as is')
Text should avoid using other ctrl characters, in particular
NULL, ETX, DC1 (XON) ,DC2 (XOFF),STX,EOT, ETB and SOH as these conflict
with TechFell control characters or underlying transports.
Extended ASCII character sets (128->255) may be used.
Dates and Times
TIME(8) = YYYYMMDD
TIME(14) = YYYYMMDDHHmmSS
TIME(14+) = YYYYMMDDHHmmSStt[hh][kk]
TIME(6) = HHmmSS
TIME(10) = HHmmSStthh
TIME(12) = HHmmSStthhkk
TIME(3) = WWw
TIME(5) = YYWWw
tt=tenths of seconds, hh=hundreths,
kk=thousandths
WW = weeknumber, w=Day of week where Monday = 1, Sunday = 7.
Note that by using these reccomendations it is possible to
automagically determine the format by establishing the length. Types of
length 6, 8 and 14 are the prefferred type, in particular
TIMESTAMP(14) is a widely used standard.
Binary data
Binary data should be base64 encoded and be prefixed
base64<spc>. A mime type or other meta data may be expressed
before the data using the following forms:
base64 Data
base64 mime Data
base64 meta_no_space Data
Needless to say, meta data that looks like a mime
type (jpeg, gif etc) may have spectacular results!
Data Transports (Multi drop RS485 etc.)
The TechFell protocol was primarily developed for TCP/IP networks, but
may be used on other data streams. On Asyncronous serial point to point
links (RS232, modems etc) it may be used as-is, but it is prefereable
to use PPP if available as this adds error detection and control.
Failing that, serial links may use the following packeting convention,
which is also appropriate for RS485 multi-drop networks and radio links:
An SOH is followed by an address byte composed of 3 ASCII HEX digits
which is then repeated. An <STX> marks the start of
TechFell protocol data. In the case of a recordset reply this may
contain more than one line seperated by the normal <ETX>
delimeter. The end of the blockis signalled by the EOT character. The
CheckSum is 4 ASCII HEX digits terminated by an <ETB>, it is the
modulus 16 sum of the ASCII values of all the bytes in the <TechFell
protocol data>. Obviously severe limits should be imposed on
the size of the recordsets in multi-drop scenarios.
Any node on a multi-drop network may be a client, a server, or both.
However
one node must serve the role of 'Master'. The master fingers each slave
node in turn with the following packet:
<SOH><ADD><ADD><ETB>
Upon reception, the slave may reply to the master with a similar packet
to indicate that it has nothing to do, or emit 1 packet. The packet may
contain a query or a reply to any other node on the network, but the
recipient may not reply until it is fingered by the master. If a slave
recieves a packet with a framing error (for example the 2 <ADD>
values don't match or there was no <EOT>, or a CheckSum error, it
does nothing.
A single node may perform the role of master, client slave,
server slave, or even implement several clients and servers, but each
one must have it's own network address, and the 'master' must finger
it's own clients 'over the network'.
Timeouts for slave responses may be configured on the master on a node
by node basis. The master may implement non sequential polling in order
to give priority to certain nodes or implement other strategies (for
example, as clients and servers have different adresses, the master can
ensure that servers get fingered following a request emitted from a
client ). The master may also 'degrade' slaves who do not reply.
Slave Clients should be configured to expect a reply from a server
within 'n' fingers from the master, and not on the basis of timeouts.
(hence masters must track the number of requests that a server has
received and finger them accordingly). A server that cannot respond to
the query immediately may send a reply to the client with a single
<ACK> character in place of the <TechFell protocol data> in
order to signal to the client that the request is "pending", and on
reception the client resets it's 'n' counter. It may do
this indefinitly but if it receives a successive request from the same
client it must discard the pending request (requests may not be queued
up, and a client should make requests on a 1 by 1 basis).
Legal stuff.
This document is copyright Roger Irwin 2005. You may read it, laugh at
it, and maybe even deploy it. It is up to the user to establish the
suitability of the protocol for any particular purpose. No liability
whatsoever is accepted.
You may use this protocol as you wish. I consider it to be in the
public domain. The technology and techniques described here have been
in use for at least 20 years and represent industry standard practices.
AFAIK There is nothing patentable herin, unless somebody has slapped a
patent on "using exclamative symbols for representing lack of
data in a field", which would be an interesting patent as it would also
cover the expression "someone has pinched the f******* instruction
manual from the tractor".
Have fun and be careful, there are some pretty nasty bytes out there.