Pre-processing your .QIF for GnuCash with python - use regex and a flat DB to add categories to transactions


By dgtlmoon - Posted on 21 November 2006

GnuCash is an excellent Linux-based accounting package. It is modelled very closely on traditional double-entry accounting, but also has lots of functionality such as generating invoices, tracking clients and all the other things you'd expect from a package that runs your finances.

One feature I really like is its ability to import .QIF files from my bank. However, it's a bit of a pain to tell GnuCash which account/category every transaction belongs to (is it a withdrawal? a fee? etc.). Banks also place serial numbers into the transactions for tracking, so the memo text differs from one transaction to the next, which means extra work in GnuCash to get each transaction into the right account.

An example transaction in a .QIF:
^
D26/09/2005
T-50.00
MWITHDRAWAL ATM - OTHER 23423342342 IGA
^
When passed through the python-regex-db engine it should appear as:
^
D26/09/2005
T-50.00
MWITHDRAWAL ATM - OTHER 23423342342 IGA
LWDL
^
Note the added category (the line starting with L). This is what GnuCash will link to a branch of its internal account tree.

Using regular expressions, we can match the Memo field (^M) against regex:category pairs stored in a Berkeley DB-style key/value store (gdbm, in the script below). When a regex matches, the script places a Category line (^L) after the memo, which GnuCash can use to link the transaction to an account.
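To make the matching step concrete, here is a minimal sketch in Python 3 syntax. It is not the script itself: a plain dict stands in for the database, and the names RULES, category_for and add_categories, as well as the "ACCOUNT FEE" pattern, are made up for illustration.

```python
import re

# Hypothetical rules (regex -> category). The real script keeps these in a DB.
RULES = {
    r"WITHDRAWAL ATM": "WDL",
    r"ACCOUNT FEE": "FEE",
}

def category_for(memo, rules):
    """Return the category of the first rule whose regex matches, else None."""
    for pattern, category in rules.items():
        if re.search(pattern, memo):
            return category
    return None

def add_categories(lines, rules):
    """Yield QIF lines, adding an L line after each M line that matches a rule."""
    for line in lines:
        line = line.rstrip("\n")
        yield line
        if line.startswith("M"):
            category = category_for(line[1:], rules)
            if category is not None:
                yield "L" + category

record = [
    "^",
    "D26/09/2005",
    "T-50.00",
    "MWITHDRAWAL ATM - OTHER 23423342342 IGA",
    "^",
]
print("\n".join(add_categories(record, RULES)))
```

Because the regex only has to match part of the memo, the serial number and the trailing text don't get in the way.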

Here's the script I kludged together in Python. Run it with something like "./qif-filter.py bank-statement.qif". For every memo that no stored regex matches, it lists the regexes you already have and asks you for a new regex and category. Memos that match a stored regex are categorised automatically. The .QIF, with categories added, is then spat out via STDOUT.

#!/usr/bin/python

import gdbm
import os
import re
import sys

db = gdbm.open("dbm", "c")

# list the regexes and categories already stored, for reference
def choose_category():
    i=0

    for key in db.keys():
        i=i+1

        print str(i)+".) "+str(db[key])+"\t - "+key

# ask for a new regex and category, store them,
# then retry the lookup and return its result
def add_key(memo):
    choose_category()
    print "What's your regex for "+str(memo)
    r=raw_input("regex: ")
    c=raw_input("category: ")
    db[r]=c
    # flush the new rule to disk
    db.sync()
    return memo_to_category(memo)

# see which regex we match
# return the category line ("L" + category name)
def memo_to_category(memo):
    for key in db.keys():
        if re.search(key, memo):
            return "L"+db[key]

    # if you got to here then we have nothing:
    # ask for a new regex and use its category
    return add_key(memo)


memoline=re.compile("^M")
lline=re.compile("^L")

if os.path.exists(sys.argv[1]):
    qif = open(sys.argv[1], 'r')
    for line in qif:
        if memoline.search(line):
            # echo the memo, then add its category line
            cat=memo_to_category(line)
            print line.rstrip()
            if cat != -1:
                print cat

        else:
            # drop any existing category lines, pass everything else through
            if lline.search(line) is None:
                print line.rstrip()
    qif.close()

db.close()
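
The script above is Python 2 (print statements, raw_input, the gdbm module). On Python 3 the gdbm module moved into the standard dbm package, and keys and values come back as bytes. A minimal sketch of just the storage part, assuming Python 3 and using made-up helper names (save_rule, load_rules) and a temporary file instead of the "dbm" file the script uses:

```python
import dbm
import os
import tempfile

def save_rule(path, pattern, category):
    """Store one regex -> category pair, creating the DB file if needed."""
    with dbm.open(path, "c") as db:
        db[pattern] = category

def load_rules(path):
    """Read all rules back as a dict of str -> str (dbm returns bytes)."""
    with dbm.open(path, "c") as db:
        return {key.decode("utf-8"): db[key].decode("utf-8") for key in db.keys()}

# Hypothetical location for the rules file, chosen only for this example.
rules_path = os.path.join(tempfile.mkdtemp(), "rules")
save_rule(rules_path, r"WITHDRAWAL ATM", "WDL")
print(load_rules(rules_path))
```

dbm.open picks whichever backend is available on your system, so the file it creates may not be readable by the Python 2 gdbm script.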