The script below will import the betfair data from data.betfair.com into a mySQL table. First create the table,
CREATE TABLE `history` (
`sports_id` int(11) DEFAULT NULL,
`event_id` int(11) DEFAULT NULL,
`country` varchar(10) DEFAULT NULL,
`scheduled_off` varchar(30) DEFAULT NULL,
`actual_off` varchar(30) DEFAULT NULL,
`full_description` varchar(150) DEFAULT NULL,
`odds` double DEFAULT NULL,
`settled_date` varchar(30) DEFAULT NULL,
`first_taken` varchar(30) DEFAULT NULL,
`latest_taken` varchar(30) DEFAULT NULL,
`course` varchar(150) DEFAULT NULL,
`volume_matched` double DEFAULT NULL,
`number_bets` double DEFAULT NULL,
`event` varchar(50) DEFAULT NULL,
`in_play` varchar(2) DEFAULT NULL,
`selection` varchar(50) DEFAULT NULL,
`selection_id` int(11) DEFAULT NULL,
`win_flag` varchar(1) DEFAULT NULL,
KEY `sports_id` (`sports_id`),
KEY `event_id` (`event_id`),
KEY `odds` (`odds`),
KEY `win_flag` (`win_flag`),
KEY `descr` (`full_description`)
) ENGINE=InnoDB DEFAULT
Then use the following script,
#!/usr/bin/python
import csv
import MySQLdb
import re
import os
path=”/path/to/files”
db = MySQLdb.connect(host=’localhost’,
user=’user’,
passwd=’pass’,
db=’betfair’)
cursor = db.cursor()
files = os.listdir(path)
for filename in files:
csv_data = csv.reader(file(path+”/”+filename))
for row in csv_data:
cursor.execute(“INSERT into history(sports_id, event_id, settled_date, country, full_description, course, scheduled_off, event, actual_off, selection_id, selection, odds, number_bets, volume_matched, latest_taken, first_taken, win_flag, in_play) values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)”, row)
db.commit()
cursor.close()
Adjust the path at the to to the location of the raw csv(uncompressed) files that you would like loaded into mySQL.
Edit the database information
Run the script
You can also add filters, for example when processing horse racing data I wanted only the following,
- UK races
- Bets placed before the event (IE not in running)
- Exclude hurdles
This is what I used
for row in csv_data:
if row[1] != “EVENT_ID” and row[3] == “GB” and row[17] == “PE” and “PLACED” not in row[7] and “Hrd” not in row[7]:
if re.search(‘[0-9]f’, row[7]):
cursor.execute(“INSERT into history(sports_id, event_id, settled_date, country, full_description, course, scheduled_off, event, actual_off, selection_id, selection, odds, number_bets, volume_matched, latest_taken, first_taken, win_flag, in_play) values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)”, row)