OCR using the open Google Cloud Vision API

8 years ago
Create a new API key in the Google developer console and paste it into the api_key variable in the code below. Then you can run "ruby whatever_you_call_the_file.rb /Downloads/some_image_with_text.jpg"; Google will OCR the image and the script will print out the recognized text.
require "base64"
require 'net/http'
require 'json'

# Base64-encode the input image (strict_encode64 emits no line breaks,
# and File.binread closes the file handle for us)
b64_data = Base64.strict_encode64(File.binread(ARGV[0]))

# Stuff we need
api_key = "" # paste your API key here
url = "https://vision.googleapis.com/v1/images:annotate?key=#{api_key}"
data = {
  "requests": [
    {
      "image": {
        "content": b64_data
      },
      "features": [
        {
          "type": "TEXT_DETECTION",
          "maxResults": 1
        }
      ]
    }
  ]
}.to_json

# Make the request
url = URI(url)
req = Net::HTTP::Post.new(url, 'Content-Type' => 'application/json')
req.body = data
res = Net::HTTP.new(url.host, url.port)
res.use_ssl = true

# res.set_debug_output $stderr

detected_text = ""
res.start do |http|
  puts "Querying Google for image: #{ARGV[0]}"
  resp = http.request(req)
  # puts resp
  json = JSON.parse(resp.body)
  # puts json
  # dig returns nil instead of raising when any level of the response is missing
  detected_text = json.dig("responses", 0, "textAnnotations", 0, "description") || ""
end

puts "Google says the image is: #{detected_text}"
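
The script above prints an empty result both when the image has no text and when the request failed (for example, because the API key was rejected). A minimal sketch of telling those cases apart is below. The helper name extract_text is hypothetical and not part of the original script, and the sample payloads are made up for illustration; it assumes that request-level failures come back as a top-level "error" object and per-image failures as an "error" object inside the matching "responses" entry.

```ruby
require "json"

# Hypothetical helper (not in the original script): returns the detected text,
# "" when no text was found, and raises when the API reported an error.
def extract_text(body)
  json = JSON.parse(body)

  # Request-level failures (e.g. a rejected API key) arrive as a top-level "error".
  if json["error"]
    message = json.dig("error", "message")
    raise "Vision API error: #{message}"
  end

  first = json.dig("responses", 0) || {}

  # Per-image failures are reported inside the individual response entry.
  if first["error"]
    message = first.dig("error", "message")
    raise "Vision API error: #{message}"
  end

  # The first annotation carries the full detected text.
  first.dig("textAnnotations", 0, "description") || ""
end

# Made-up sample payloads, for illustration only
with_text = '{"responses":[{"textAnnotations":[{"description":"Hello"}]}]}'
no_text   = '{"responses":[{}]}'

puts extract_text(with_text)
puts extract_text(no_text).empty?
```

In the script, this would replace the JSON handling inside the res.start block, i.e. detected_text = extract_text(resp.body).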