ruby on rails - How should I import this data into my database?
I have a database with thousands of records:

    code  | name  | price
    00106 | water | 9.99
    00107 | onion | 8.99
It is coded in a "ges" file like the one below, where 00f means a column header and 00i means an insert row (there are others, like 00d for delete row or 00u for update):

    00f
    0101
    02code
    031
    00f
    0102
    02name
    031
    00f
    0103
    02price
    030
    00i
    0100106
    02water
    030999
    00i
    0100107
    02onion
    030899
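As a quick sketch of how a single line decodes (the two-decimal price scaling is inferred from the sample values, 030999 → 9.99):

```ruby
# Each data line is a two-digit field index followed by the value:
# "0100106" is field 01 (code) with the value "00106".
line  = "0100106"
index = line[0, 2]   # "01"
value = line[2..]    # "00106"

# Prices carry two implied decimal places: "030999" -> 9.99
price = "030999"[2..].sub(/^0+/, '').to_f / 100
```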
I want to create an importer that processes the file and pushes it to the database. I started to implement it like this:
    class Importer
      CONN = ActiveRecord::Base.connection
      F = "00f"
      I = "00i"

      def extract_to_database(collection)
        @f = []
        tmp = []
        type = F
        inserts = []

        collection.each_with_index do |line, i|
          _type = line.strip
          _changed = [F, I].include?(_type)

          if _changed && i > 0
            case type
            when F
              @f << tmp
            when I
              group_id = Group.find_by(code: tmp[1]).id
              inserts.push "(#{group_id},'#{tmp[2]}','#{tmp[3]}')"
            end

            tmp = []
            type = _type
          end

          tmp << line
        end

        sql = "INSERT INTO products (`group_id`, `name`, `price`) VALUES #{inserts.join(", ")}"
        CONN.execute sql
      end
    end
There is one problem with that: I would like to refactor it using functional programming.
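For what it's worth, the kind of functional rewrite being asked about could look like this sketch: `Enumerable#slice_before` groups the flat token stream into records, and a `map` decodes the insert records into rows (the column meanings and price scaling are assumed from the sample data):

```ruby
MARKERS = %w[00f 00i]

# Group the token stream into [marker, field, field, ...] records,
# keep only the insert records, and decode each into [code, name, price].
def rows_from(tokens)
  tokens
    .slice_before { |t| MARKERS.include?(t) }
    .select { |marker, *| marker == "00i" }
    .map do |_, *fields|
      code, name, price = fields.map { |f| f.sub(/^\d\d/, '') }
      [code, name, price.sub(/^0+/, '').to_f / 100]
    end
end

tokens = %w[00f 0101 02code 031 00i 0100106 02water 030999 00i 0100107 02onion 030899]
rows_from(tokens)
# => [["00106", "water", 9.99], ["00107", "onion", 8.99]]
```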
Also, I have to find another model by code and put its id into the products table's related some_model_id column, which can complicate the whole process, because right now importing the data takes me a few hours.
Maybe using Ruby is not the best option.
There's nothing here Ruby can't handle, and it's not clear how "functional programming" would help any, either; this is a classic state-machine sort of problem with some simple data transformation going on.
An example scaffold:
    class SomethingImporter
      FIELD_MARKER = "00f"
      INSERT_MARKER = "00i"

      COLUMNS = %w[ group_id name price ]

      # Performs an insert for the given model. This should probably be a
      # class method on the model itself.
      def bulk_insert(model, rows)
        sql = [
          "INSERT INTO `#{model.table_name}` (#{COLUMNS.collect { |c| "`#{c}`" }.join(',')}) VALUES "
        ]

        # Append placeholders: (?,?,?),(?,?,?),...
        sql[0] += ([ '(%s)' % ([ '?' ] * COLUMNS.length).join(',') ] * rows.length).join(',')

        sql += rows.flatten

        model.connection.execute(model.send(:sanitize_sql, sql))
      end

      # Resolves a group code into a group_id value, and caches the result
      # so subsequent look-ups of the same code are cheap.
      def group_id(group_code)
        @find_group ||= { }

        # Tests if a value has been cached for this code, including one
        # that might be nil.
        if (@find_group.key?(group_code))
          return @find_group[group_code]
        end

        group = Group.find_by(code: group_code)

        @find_group[group_code] = group && group.id
      end

      # Call with the actual collection, lines stripped, and any header
      # lines removed (e.g. with collection.shift)
      def extract_rows(collection)
        state = nil
        rows = [ ]
        row = [ ]

        collection.each do |line|
          case (line)
          when FIELD_MARKER
            # Indicates field data to follow
            state = :field
          when INSERT_MARKER
            case (state)
            when :insert
              rows << [ row[0], row[1], (row[2].sub(/^0+/, '').to_f / 100) ]
            end

            state = :insert
            row = [ ]
          else
            case (state)
            when :field
              # Presumably you'd pay attention to the field data here and
              # establish a mapping table.
            when :insert
              row << line.sub(/^\d\d/, '')
              # puts row.inspect
            end
          end
        end

        case (state)
        when :insert
          rows << [ row[0], row[1], (row[2].sub(/^0+/, '').to_f / 100) ]
        end

        rows
      end
    end

    data = <<END
    00f
    0101
    02code
    031
    00f
    0102
    02name
    031
    00f
    0103
    02price
    030
    00i
    0100106
    02water
    030999
    00i
    0100107
    02onion
    030899
    END

    importer = SomethingImporter.new

    puts importer.extract_rows(data.split(/\n/).map(&:strip)).inspect
The example output of this, based on your data, looks like:
    [["00106", "water", 9.99], ["00107", "onion", 8.99]]
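The placeholder-building line in bulk_insert is dense enough to be worth trying in isolation; it produces one `(?,?,?)` group per row:

```ruby
columns = %w[group_id name price]
rows    = [["1", "water", "999"], ["2", "onion", "899"]]

# One "(?,?,?)" group per row, joined by commas.
placeholders = ([ '(%s)' % ([ '?' ] * columns.length).join(',') ] * rows.length).join(',')
# => "(?,?,?),(?,?,?)"
```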
When writing code like this, be sure to expose intermediate results so you're able to test what's happening. Your implementation takes the data and dumps it directly into the database in one shot, making it hard to tell where something's going wrong if it doesn't work out properly. This version is composed of several methods, each of which has a more specific purpose.
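As a sketch of that testing idea (parse_price is a hypothetical extracted helper, not part of the scaffold above): any piece that returns a plain value instead of writing to the database can be asserted against directly, for example with Minitest:

```ruby
require "minitest/autorun"

# Hypothetical extracted helper: decodes a raw price field such as "0999".
def parse_price(raw)
  raw.sub(/^0+/, '').to_f / 100
end

class ParsePriceTest < Minitest::Test
  def test_strips_leading_zeros_and_scales
    assert_equal 9.99, parse_price("0999")
    assert_equal 8.99, parse_price("0899")
  end
end
```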
It's not clear in your original example why you're resolving group_id at all, since your sample output has nothing to do with it, but as an example I've included a method that resolves codes and keeps them cached, avoiding repeated lookups of the same thing. For a larger-scale import you'd load in many rows, extract the distinct group codes, load them all at once, and remap them before inserting.
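That batched remap could be sketched like this (the Hash stands in for what, with ActiveRecord, would be a single `Group.where(code: codes).pluck(:code, :id).to_h` query):

```ruby
# Rows as returned by extract_rows: [code, name, price]
rows = [["00106", "water", 9.99], ["00107", "onion", 8.99]]

# One lookup for all distinct codes, then remap before inserting.
codes      = rows.map(&:first).uniq
code_to_id = { "00106" => 1, "00107" => 2 }  # stand-in for the query result

remapped = rows.map { |code, name, price| [code_to_id.fetch(code), name, price] }
# => [[1, "water", 9.99], [2, "onion", 8.99]]
```

Using `fetch` here (rather than `[]`) makes an unknown code fail loudly instead of silently inserting a nil group_id.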