This page describes an in-progress "DAML-in-the-Small" application
I'm writing to reconcile my personal travel and other expenses.
I've long wanted to implement something like this,
but hadn't previously had sufficient motivation or tools.
DAML worked quite well.
While it's unlikely that others will be able to use most of this
software directly,
the approach and lessons learned should be more widely applicable.
Background
I travel quite a bit
(20-30 business trips a year for the past 8 years)
and have to file an expense report for each trip.
When my monthly Corporate American Express bill arrives,
I spot check it, but
don't carefully cross-check it with my trip reports.
If a hotel added a charge of a few dollars after checkout,
or a rental car was billed twice in two consecutive months,
I probably wouldn't catch it.
Based on the errors I have occasionally found,
I expect more may be lurking.
Data Sources
I started with 5 different data sources:
- my monthly Corporate American Express credit card statements,
which are now downloadable as comma-separated-value (.csv) files from
www.americanexpress.com
- my checking account register, which I maintain using CheckFree
software on an archaic Mac.
CheckFree maintains its database in a format that (surprisingly)
seems to be compatible with Microsoft dBase
(when copied onto my Windows 2000 laptop).
- my corporate trip reports, which are submitted as Excel spreadsheets,
along with a separate "hotel balance" spreadsheet that I normally
prepare to make sure all hotel expenses are allocated
- my corporate non-travel expense reports, which are prepared
using a proprietary EECR application that stores its
data in text files.
- my Hilton HHonors frequent guest information, which is available from
www.hhonors.com in a "printable" HTML format easily parsed by WebL
I developed a DAML ontology and conversion program(s) for each of these
data sources.
These are linked from the table below,
which also contains some statistics on my current data
(which is not being made available).
Reconciliation
I decided to use JESS rules to implement the matching between expenses.
The current version is match.clp.
The DAML is converted into an input file of JESS facts
by a genfacts.java program that I'd previously developed,
and that file is then loaded into JESS.
In the future, I'd like to load DAML statements directly through the JESS API.
Each DAML statement is represented as a single JESS fact
(it would be interesting to compare the performance of this approach to
representing known schemas as unordered facts).
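As a rough illustration of the one-fact-per-statement representation, the sketch below turns subject/predicate/object statements into ordered-fact text. It is written in Python purely for brevity and is not the Java of genfacts.java, whose actual output format is not shown on this page. The `triple` fact head, the helper names, and the example.org URIs are all hypothetical.

```python
def quote(term):
    """Wrap a term in double quotes, escaping backslashes and embedded quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_fact(subject, predicate, obj):
    """Render one statement as a single ordered fact.

    The 'triple' head and the predicate-first ordering are assumptions
    made for this sketch, not the format genfacts.java emits.
    """
    return "(triple {} {} {})".format(quote(predicate), quote(subject), quote(obj))


def to_deffacts(name, statements):
    """Group the facts in a deffacts construct so that (reset) asserts them."""
    lines = ["(deffacts {}".format(name)]
    lines.extend("  " + to_fact(s, p, o) for (s, p, o) in statements)
    lines.append(")")
    return "\n".join(lines)


# Hypothetical statements about one credit card charge.
statements = [
    ("http://example.org/amex#charge1", "http://example.org/amex#amount", "123.45"),
    ("http://example.org/amex#charge1", "http://example.org/amex#merchant", "Hilton"),
]
print(to_deffacts("expenses", statements))
```

Because every statement has the same shape, no per-ontology templates are needed, which is the trade-off against unordered facts with named slots mentioned above.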
After the rules and facts are loaded into JESS,
the rules are fired.
Several JESS queries are then performed to identify any unmatched
transactions.
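The match-then-query flow can be sketched as follows. This is a Python approximation under stated assumptions, not a transcription of match.clp: the matching criteria (equal amount, dates within a few days), the `match_transactions` helper, the record fields, and the sample data are all hypothetical.

```python
from datetime import date, timedelta


def match_transactions(charges, report_items, max_days=3):
    """Pair each charge with the first unused report item of equal amount
    dated within max_days of it.

    Returns (matches, unmatched charge ids, unmatched report item ids).
    Amounts are integer cents so that equality tests are exact.
    """
    matches = []
    used = set()
    unmatched_charges = []
    for charge in charges:
        for i, item in enumerate(report_items):
            if i in used:
                continue
            same_amount = charge["cents"] == item["cents"]
            close = abs(charge["date"] - item["date"]) <= timedelta(days=max_days)
            if same_amount and close:
                matches.append((charge["id"], item["id"]))
                used.add(i)
                break
        else:
            # No report item accounted for this charge.
            unmatched_charges.append(charge["id"])
    unmatched_items = [
        item["id"] for i, item in enumerate(report_items) if i not in used
    ]
    return matches, unmatched_charges, unmatched_items


# Hypothetical data: the rental car is billed again a month later.
charges = [
    {"id": "amex-1", "cents": 18950, "date": date(2002, 3, 14)},
    {"id": "amex-2", "cents": 4200, "date": date(2002, 3, 15)},
    {"id": "amex-3", "cents": 4200, "date": date(2002, 4, 15)},
]
report_items = [
    {"id": "trip-7-hotel", "cents": 18950, "date": date(2002, 3, 13)},
    {"id": "trip-7-car", "cents": 4200, "date": date(2002, 3, 15)},
]
matches, unmatched_charges, unmatched_items = match_transactions(charges, report_items)
print(matches)
print(unmatched_charges)
```

In this example the second rental car charge is left unmatched, which is the kind of transaction the final queries are meant to surface.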
Findings
The application takes about 200 seconds to run on my 850 MHz laptop.
I haven't made any significant attempts to optimize this.
Almost all of the time is spent loading the facts.
The rules fire in a few seconds,
and currently generate 422 matches.
Memory usage grows to 200 MB.
Obviously, it would be nice to get more of the original information
(hotel check-out statements, credit card statements, etc.)
directly in DAML form.
The semantics of some of the current sources are murky.
For example, our corporate trip report format
has no way to distinguish parking charges that are included in a
hotel bill from those that are paid separately in cash or by credit card.
It would be nice to have a graph filtering tool
to extract only the transactions with category #BBN
from my generated check.daml file before importing them into JESS.
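A minimal sketch of what such a filter might do, assuming the statements are available as (subject, predicate, object) tuples. The `filter_by_category` helper, the `category` property URI, and the sample statements are hypothetical; only the #BBN category and the check.daml file name come from this page.

```python
def filter_by_category(statements, category_property, category_value):
    """Keep every statement whose subject carries the given category.

    Only statements directly about a selected subject are kept; resources
    that those statements merely point to are not followed.
    """
    selected = {
        s for (s, p, o) in statements
        if p == category_property and o == category_value
    }
    return [(s, p, o) for (s, p, o) in statements if s in selected]


# Hypothetical statements in the style of a generated check.daml file.
CATEGORY = "http://example.org/check#category"
AMOUNT = "http://example.org/check#amount"
statements = [
    ("check.daml#t1", CATEGORY, "check.daml#BBN"),
    ("check.daml#t1", AMOUNT, "250.00"),
    ("check.daml#t2", CATEGORY, "check.daml#Groceries"),
    ("check.daml#t2", AMOUNT, "61.17"),
]
bbn_only = filter_by_category(statements, CATEGORY, "check.daml#BBN")
print(bbn_only)
```

Filtering before the load step would also shrink the fact file, which is where almost all of the run time is spent.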
I haven't yet identified any inconsistent charges or payments,
but plan to keep looking!
Possible Future Directions
- add more data sources (e.g.
United MileagePlus)
- investigate use of the JESS support in GRCI's
DAML API
- adopt more formal accounting principles
- ...
Author
Mike Dean
$Id: index.xml,v 1.11 2002/04/11 14:14:38 kmbarber Exp $