Ruby HexaPDF: a code example for generating and merging PDFs

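  require 'fileutils'
  require 'hexapdf'
  require 'down'

  # Generates the PDF. Pass cover = true to regenerate a file that already exists.
  def generate_pdf(cover)
    path = "path/to/your/file"
    filepath = "path/to/your/file/my.pdf"
    # File.exists? was removed in Ruby 3.2; File.exist? works on all versions.
    FileUtils.mkdir_p(path) unless File.exist?(path)
    if File.exist?(filepath) && cover == true  # regenerate
      file_size = File.size(filepath)
      # Note: on Unix-like systems ctime is the last status-change time, not the creation time.
      ctime = File.ctime(filepath)
      puts "File size is #{file_size} bytes. File created at #{ctime}."
      File.delete(filepath)
    else  # the file already exists: return without regenerating
      return if File.exist?(filepath)
    end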

    html_content = '<br>　　<img src="https://site.com/system/tybbs/p/m/141756965.jpg?x-oss-process=style/mob640" original="http://img3.laibafile.cn/p/m/141756965.jpg"><br><br>　　最后给大家发张图片吧，是个小游戏：股票预测<br>　　不知道有没有朋友玩过，可以看涨看跌，得分来自于你预测的股票是否跑赢大盘<br>　　这个号主要开来用博客记录自己操作的股票，10只票有2只得分为负，2个负都挺冤的<br>　　因为我盘中不在，所以只能晚上收盘开始预测，然后第二日早开盘，以开盘价为预测价<br>　　结束预测也是今晚提交，明日上午以开盘价结束<br>　　1只是迪安诊断，这个最冤，晚上提交结束时得分是正的，第二日开盘低开结束预测的<br>　　1只东北电气，4月26日，提交预测的，2号开盘起算的，盘中不在，无法结束，又是个白负的。<br>　　这10只票里面有几只是我最近操作的票，<br>　<img src="https://site.com/system/tybbs/p/m/141756965.jpg?x-oss-process=style/mob640" original="http://img3.laibafile.cn/p/m/141756965.jpg">　23日晚选出的迪安诊断，<br>　　25日选出的姚记扑克<br>　　26日选的瑞丰光电<br>　　2日选的罗平锌电<br>　　3日选的昌九生化<br>　　7日选的康得新<br>　　除了没卖的康得新和昌九生化<br>　　都是以盈利出局的，5月份到现在才6个交易日<br>　　已逮住2个涨停了，姚记扑克和昌九生化，明天晚上会发微博内容证实<br><br>　　其实不想发这些的，主要怕有人来捣乱，你们知道的，这个是蛋疼的天涯<br>　　动不动就想要人实盘的天涯<br><br><br><br><br><br><br><br><br><br><br><br>'.gsub("<br>", "\n")
    composer = HexaPDF::Composer.new
    # Register a font that contains CJK glyphs so the Chinese text can be rendered.
    composer.document.config['font.map'] = {
      'SimHei' => {
        none: 'lib/assets/fonts/SimHei.ttf'
      }
    }

    composer.style(:base, line_spacing: 1.5, last_line_gap: true, align: :justify, font: ['SimHei', variant: :none], font_size: 20)
    composer.style(:image, border: {width: 1}, padding: 5, margin: 10)
    composer.text(self.title, font: ['SimHei', variant: :none], font_size: 30, align: :center)

    # Split into text chunks and <img> tags; the capture group keeps the tags in the result.
    # [^>]* stops at the end of each tag, whereas .* would swallow everything up to the last ">" on the line.
    html_content.split(/(<img[^>]*>)/).each do |chunk|
      # DigUrl is a helper that is not shown in this post; it is used here to pull the URLs out of the chunk.
      imgs = DigUrl.new(chunk.to_s, '"').urls

      if imgs.length > 0
        # Use select, not select!: select! returns nil when no element is removed.
        imgs = imgs.select { |url| url.to_s.include?('site') }
        imgs.each do |url|
          # Switch the scheme to http and drop the query string.
          url = url.to_s.sub(/\Ahttps:/, 'http:').split('?')[0]
          composer.text(url.to_s, position: :flow)
          image = Down.download(url)
          composer.image(image)
        end
      else
        composer.text(chunk, position: :flow)
      end
    end

    # Alternative image placements:
    #composer.image(image, style: :image, width: 200, position: :float)
    #composer.image(image, style: :image, width: 200, position: :absolute, position_hint: [200, 300])
    composer.write(filepath)
    tmpfile = "public/p/#{self.id}/1.pdf"
    merge_pdf(filepath, [filepath, tmpfile])
  end

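  # Merges the PDF files in pdfs, in order, into a single file written to name.
  def merge_pdf(name, pdfs)
    target = HexaPDF::Document.new
    pdfs.each do |file|
      pdf = HexaPDF::Document.open(file)
      pdf.pages.each {|page| target.pages << target.import(page)}
    end
    # name may also be one of the inputs (as in generate_pdf), so write to a
    # temporary file and rename it afterwards rather than overwriting a file
    # that may still be open for reading.
    tmp_name = "#{name}.tmp"
    target.write(tmp_name, optimize: true)
    File.rename(tmp_name, name)
  end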
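
The `DigUrl` helper called in `generate_pdf` is not shown in this post. As a rough stand-in, the sketch below splits an HTML string into text and image segments using only a regular expression. The method name `split_html_segments` is hypothetical, and it assumes every image URL sits in a double-quoted `src` attribute; it is not the author's implementation.

```ruby
# Hypothetical stand-in for DigUrl: returns [:text, string] and
# [:image, url] pairs in document order.
def split_html_segments(html)
  # The capture group makes split keep the <img> tags in the result.
  html.split(/(<img[^>]*>)/).filter_map do |chunk|
    if (m = chunk.match(/\A<img[^>]*\bsrc="([^"]+)"/))
      # Drop the query string, as generate_pdf does before downloading.
      [:image, m[1].split('?').first]
    elsif !chunk.strip.empty?
      [:text, chunk]
    end
  end
end
```

Each `[:image, url]` pair could then be passed to `Down.download` and `composer.image`, and each `[:text, string]` pair to `composer.text`.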

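Why merge?
The PDF is too large: generating the complete file in one pass takes a great deal of time and resources, and the run eventually failed after exhausting the machine's 4 GB of memory. The plan is to build it incrementally as "main PDF = main PDF + appended PDF": each run generates only one "appended PDF" and merges it into the "main PDF". This is still single-threaded, so it may remain slow; for multithreaded processing, it is enough to generate several "appended PDFs" at the same time.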
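
The plan above can be sketched as follows: a small pool of threads generates the parts, then each part is folded into the main file in order. The method name `build_incrementally` and the `generate:` and `merge:` callables are hypothetical; in this post's setting `merge:` could be `method(:merge_pdf)`, and `generate:` would wrap the Composer code for one part.

```ruby
# Sketch: generate parts with a few worker threads, then merge them
# into the main file one at a time ("main = main + part").
def build_incrementally(main_path, part_paths, workers: 2, generate:, merge:)
  queue = Queue.new
  part_paths.each { |path| queue << path }
  queue.close # pop returns nil once the closed queue is empty

  threads = Array.new(workers) do
    Thread.new do
      while (part = queue.pop)
        generate.call(part)
      end
    end
  end
  threads.each(&:join)

  # Merge sequentially so the page order follows part_paths.
  part_paths.each do |part|
    inputs = File.exist?(main_path) ? [main_path, part] : [part]
    merge.call(main_path, inputs)
  end
  main_path
end
```

On MRI, threads mainly overlap I/O such as image downloads; CPU-bound layout work still runs under the global VM lock, so separate processes may be needed for a real speed-up.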

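The open question is: how efficient is merging? How can the PDF be generated successfully with the least resources and in the shortest time?
This can be tested on a real case.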
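
One way to run such a test is to time each step and record how much the process's resident memory grows. The helpers `rss_kb` and `measure` below are hypothetical names, and the memory reading is only an approximation taken from `/proc` or `ps`.

```ruby
require 'benchmark'

# Resident set size of this process in KB (Linux /proc, else ps).
# Returns nil when neither source is available.
def rss_kb
  if File.readable?('/proc/self/status')
    File.read('/proc/self/status')[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Process.pid}`.to_i
  end
rescue SystemCallError
  nil
end

# Runs the block and reports wall-clock time and RSS growth.
def measure(label)
  rss_before = rss_kb
  result = nil
  seconds = Benchmark.realtime { result = yield }
  rss_after = rss_kb
  growth = rss_before && rss_after ? rss_after - rss_before : nil
  { label: label, seconds: seconds, rss_growth_kb: growth, result: result }
end
```

For example, `measure('merge') { merge_pdf(main, [main, part]) }` could be compared against a run that generates the whole PDF in one pass.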

Published: 2023-10-16
Updated: 2023-10-16