ruby on rails - How should I import this data into my database? -


i have database thousands of records

code  | name  | price 00106 | water | 9.99 00107 | onion | 8.99 

which coded in ges file below:

  • 00f means column header
  • 00i means insert row

there others like(00d delete row or 00u update)

00f 0101 02code 031 00f 0102 02name 031 00f 0103 02price 030 00i 0100106 02water 030999 00i 0100107 02onion 030899 

i want create importer process file , push database. started implemented that:

class importer   conn = activerecord::base.connection   f = "00f"   = "00i"    def extract_to_database(collection)     add       = true     tmp       = []     type      = f     inserts   = []      collection.each_with_index |line, i|       _type    = line.strip       _changed = [f,i].include? _type        if _changed && > 0         case type         when f @f << tmp         when           group_id = group.find_by(code: tmp[1]).id           inserts.push "(group_id,'#{tmp[2]}','#{tmp[3]}')"         end          tmp  = []         type = _type       end        tmp << line     end     sql = "insert products (`group_id`, `name`, `price`) values #{inserts.join(", ")}"     conn.execute sql   end end 

there 1 problem that, refactor using functional programming.

and have find other model code , put products table related some_model_id column can complicate whole process. because right importing data takes me few hours.

maybe using ruby not best option.

there's nothing here ruby can't handle. it's not clear how "functional programming" any, either, classic state-machine sort of problem simple data transformation going on.

example scaffold:

class somethingimporter   field_marker = "00f"   insert_marker = "00i"    columns = %w[ group_id name price ]    # performs insert given model. should class   # method on model itself.   def bulk_insert(model, rows)     sql = [       "insert `#{model.table_name}` (#{columns.collect { |c| }}"     ]      # append placeholders: (?,?,?),(?,?,?),...     sql[0] += ([ '(%s)' % ([ '?' ] * columns.length).join(',') ] * rows.length).join(',')      sql += rows.flatten      model.connection.execute(model.send(:sanitize_sql, sql))   end    # resolve group code group_id value, , cache result   # subsequent look-ups same code valid.   def group_id(group_code)     @find_group ||= { }      # tests if value has been cached code, including 1     # might nil.     if (@find_group.key?(group_code))       return @find_group[group_code]     end      group = group.find_by(code: group_code)      @find_group[group_code] = group && group.id   end    # call actual collection, lines stripped, , header   # lines removed (e.g. collection.shift)   def extract_rows(collection)     state = nil     rows = [ ]     row = [ ]      collection.each_with_index |line|       case (line)       when field_marker         # indicates field data follow         state = :field       when insert_marker         case (state)         when :insert           rows << [ row[0], row[1], (row[2].sub(/^0+/, '').to_f / 100) ]         end          state = :insert         row = [ ]       else         case (state)         when :field           # presumably you'd pay attention data here , establish           # mapping table.         when :insert           row << line.sub(/^\d\d/, '')           # puts row.inspect         end       end     end      case (state)     when :insert       rows << [ row[0], row[1], (row[2].sub(/^0+/, '').to_f / 100) ]     end      rows   end end   data = <<end 00f 0101 02code 031 00f 0102 02name 031 00f 0103 02price 030 00i 0100106 02water 030999 00i 0100107 02onion 030899 end  importer = somethingimporter.new  puts importer.extract_rows(data.split(/\n/)).inspect 

the example output this, based on data, looks like:

[["00106", "water", 9.99], ["00107", "onion", 8.99]] 

when writing code this, sure expose intermediate results in order able test what's happening. implementation takes data , dumps directly in database in 1 shot, making hard tell something's going wrong if doesn't work out properly. version composed of several methods, each of has more specific purpose.

it's not clear in original example why you're resolving group_id @ all, sample output has nothing that, example i've included method resolves them , keeps them cached, avoiding repeated lookups of same thing. larger scale import you'd load in many rows, extract out distinct group_id values, load them @ once, , remap them before inserting.


Comments

Popular posts from this blog

image - ClassNotFoundException when add a prebuilt apk into system.img in android -

I need to import mysql 5.1 to 5.5? -

Java, Hibernate, MySQL - store UTC date-time -