Importing large data sets into MySQL
I'm trying to import the Wikipedia access logs (http://dumps.wikimedia.org/other/pagecounts-raw/) into MySQL for internal use.
Goal: draw day/URL graphs.
Each Wikipedia file name contains the day and hour, and each line in a file has the following structure:
language url visits size_of_answer
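To make the input format concrete, here is a minimal Python sketch that pulls the day and hour out of a file name and splits one log line into its four fields. The file-name pattern (`pagecounts-YYYYMMDD-HHMMSS.gz`), the space separator, and the helper names `parse_filename` and `parse_line` are assumptions for illustration and are not stated in the post.

```python
import re
from datetime import datetime

# Assumed file-name pattern, e.g. "pagecounts-20120101-130000.gz".
FILENAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{2})\d{4}")


def parse_filename(name):
    """Return (day, hour) taken from a log file name (hypothetical helper)."""
    match = FILENAME_RE.search(name)
    if match is None:
        raise ValueError("unexpected file name: %r" % name)
    day = datetime.strptime(match.group(1), "%Y%m%d").date()
    return day, int(match.group(2))


def parse_line(line):
    """Split one line into (language, url, visits, size_of_answer).

    Assumes exactly four space-separated fields; anything else raises
    ValueError so the caller can skip or log the line.
    """
    language, url, visits, size = line.rstrip("\n").split(" ")
    return language, url, int(visits), int(size)
```

A filter step could call `parse_line` on every line, keep only the URLs of interest, and write them out together with the day from `parse_filename`.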
My current DB structure:
- table urls: url_id; url_string; language (indexed)
- table visits: visits_id; url_id; day_stamp; visits_count (indexed)
- table temp: visits_id; url_id; url_string; language; visits_count; day_stamp
MySQL engine: InnoDB
Current method:
- I filter the raw logs down to the lines that interest me. The filtered file contains about 250k lines and has the structure of the "temp" table.
- I import the file into "temp" with LOAD DATA INFILE.
- I set url_id in "temp" for URLs that already exist (UPDATE ... JOIN of temp and urls on url_string).
- I insert the URLs from "temp" into "urls" where temp.url_id=0.
- I set url_id in "temp" again for existing URLs (UPDATE ... JOIN of temp and urls on url_string).
- I set visits_id in "temp" for visits rows that already exist (UPDATE ... JOIN of temp and visits on url_id).
- I insert rows from "temp" into "visits" where visits_id=0.
- I update "visits" from the rows in "temp" where visits_id!=0.
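The steps above can be sketched end to end as follows. This is only an illustration under several assumptions: it uses Python's built-in `sqlite3` as a runnable stand-in for MySQL, so correlated subqueries replace the `UPDATE ... JOIN` statements and plain inserts replace `LOAD DATA INFILE`; the staging table is called `staging` here instead of "temp"; URLs are matched on (url_string, language) and visits on (url_id, day_stamp); and the final update adds the new count to the stored one. None of these details are confirmed by the post.

```python
import sqlite3

# Schema mirroring the tables in the post; "staging" plays the role of "temp".
# INTEGER PRIMARY KEY ids start at 1, so 0 can mean "not matched yet".
SCHEMA = """
CREATE TABLE urls (url_id INTEGER PRIMARY KEY, url_string TEXT, language TEXT);
CREATE INDEX urls_lookup ON urls (url_string, language);
CREATE TABLE visits (visits_id INTEGER PRIMARY KEY, url_id INTEGER,
                     day_stamp TEXT, visits_count INTEGER);
CREATE INDEX visits_lookup ON visits (url_id, day_stamp);
CREATE TABLE staging (visits_id INTEGER DEFAULT 0, url_id INTEGER DEFAULT 0,
                      url_string TEXT, language TEXT,
                      visits_count INTEGER, day_stamp TEXT);
"""

# Assumption: a URL is identified by (url_string, language).
SET_URL_ID = """
UPDATE staging SET url_id = COALESCE(
    (SELECT u.url_id FROM urls u
     WHERE u.url_string = staging.url_string
       AND u.language = staging.language), 0)
"""

# Assumption: a visits row is identified by (url_id, day_stamp).
SET_VISITS_ID = """
UPDATE staging SET visits_id = COALESCE(
    (SELECT v.visits_id FROM visits v
     WHERE v.url_id = staging.url_id
       AND v.day_stamp = staging.day_stamp), 0)
"""


def merge_batch(conn, rows):
    """Merge rows of (url_string, language, visits_count, day_stamp).

    Assumes each (url, language, day) appears at most once per batch.
    """
    cur = conn.cursor()
    cur.execute("DELETE FROM staging")
    # Stand-in for LOAD DATA INFILE.
    cur.executemany(
        "INSERT INTO staging (url_string, language, visits_count, day_stamp) "
        "VALUES (?, ?, ?, ?)", rows)
    cur.execute(SET_URL_ID)            # match existing urls
    cur.execute(                       # insert the unknown urls
        "INSERT INTO urls (url_string, language) "
        "SELECT DISTINCT url_string, language FROM staging WHERE url_id = 0")
    cur.execute(SET_URL_ID)            # pick up the new url ids
    cur.execute(SET_VISITS_ID)         # match existing visits rows
    cur.execute(                       # insert visits that did not exist
        "INSERT INTO visits (url_id, day_stamp, visits_count) "
        "SELECT url_id, day_stamp, visits_count FROM staging "
        "WHERE visits_id = 0")
    cur.execute(                       # add counts to visits that did exist
        "UPDATE visits SET visits_count = visits_count + "
        "(SELECT s.visits_count FROM staging s "
        " WHERE s.visits_id = visits.visits_id) "
        "WHERE visits_id IN (SELECT visits_id FROM staging WHERE visits_id != 0)")
    conn.commit()
```

Written out this way, the method makes six passes over the staging table after the load, which is where the indexes on the lookup columns matter most.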
Altogether, the import takes 5 minutes.
Is there a faster way, with different steps or a different DB?