<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Hugot Blog</title>
<link>https://hugot.nl/blog.html</link>
<description>Hugo's personal blog</description>
<language>en-us</language>
<pubDate>Thu, 16 Apr 2020 08:36:12 +0200</pubDate>
<lastBuildDate>Thu, 16 Apr 2020 08:36:12 +0200</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<generator>Hugo's Custom Bash Script</generator>
<managingEditor>social@hugot.nl (Hugot)</managingEditor>
<webMaster>infra@hugot.nl (Hugot Infra)</webMaster>
<item><title> How To Use Your Email Client For Physical Mail </title><link>https://hugot.nl/posts/use-your-mail-client-for-physical-mail/index.html</link><description> &lt;h1&gt;
How To Use Your Email Client For Physical Mail
&lt;/h1&gt;
&lt;p&gt;
Whether it&amp;#39;s to re-read a conversation, find a plane ticket I ordered or
check when a meeting was planned, I often find myself looking up old
emails. It&amp;#39;s usually easy to do so because email clients are designed for
the task: many of them support full-text search and some even complement
that with neat tagging and categorization systems. To be honest, I have
become completely dependent on those features in my day-to-day
life. Having full-text search and some sort of categorization for email
can be a huge time saver. When it comes to physical mail, however, I still
have to browse through stacks of paper to (hopefully) find what I&amp;#39;m
looking for. I figured that it&amp;#39;d be nice to use my fancy email client to
deal with physical mail as well, so I found a way to do just that. Turns
out it&amp;#39;s pretty simple!
&lt;/p&gt;
&lt;p&gt;
The main objective here is to transform our physical mail into an email
that can be received, indexed and read by our email client of choice. Now,
one way to do that would be to type the contents of our mail into an email
by hand, but
&lt;i&gt;
ain&amp;#39;t nobody got time for that!
&lt;/i&gt;
. The (more appealing)
alternative is to use a document scanner. I have a single-purpose scanner
from Canon that I hook up to my laptop for exactly this.
&lt;/p&gt;
&lt;p&gt;
It isn&amp;#39;t as simple as just emailing a scanned document to ourselves,
though: email clients are smart, but they can&amp;#39;t understand a word of text
in our PDF or JPEG of a physical document. They need content to be in
plain text form in order to provide us with some of their best features
like full-text search. We&amp;#39;ll have to somehow transform our scanned
documents into plain text that we can include in our email. To do this, we
can use tesseract. Tesseract is an optical character recognition (OCR)
engine, meaning that it can recognize text in images and extract it for
us. Installing it should be easy on Debian-derived distros like
Ubuntu. My laptop is running Debian unstable so I just ran
&lt;code&gt;
apt
install tesseract-ocr
&lt;/code&gt;
and started using it. Using it is as easy as
opening a terminal and typing
&lt;code&gt;
tesseract FILE.jpg
OUTPUT
&lt;/code&gt;
. That command will save all the text that tesseract is able
to recognize in the image FILE.jpg to a file called OUTPUT.txt.
&lt;/p&gt;
&lt;aside&gt;
&lt;i&gt;
Side note: I am Dutch, so most of my physical mail is in Dutch. To
make tesseract better understand my mail I installed the
tesseract-ocr-nld package using
&lt;code&gt;
apt install
tesseract-ocr-nld
&lt;/code&gt;
. You can check what other language packs are
available by using
&lt;code&gt;
apt search tesseract-ocr
&lt;/code&gt;
.
&lt;/i&gt;
&lt;/aside&gt;
&lt;p&gt;
All we have to do from there is copy-paste the contents of that file into
an email and send it to ourselves! Depending on the formatting of the
input document, the output may not always be pleasant to read. We can
account for this by including the original document as an attachment to
the email. That way we get the best of both worlds: we can use the search
functionality of our email client to find the document, and then read it
in its original form by opening the attachment.
&lt;/p&gt;
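&lt;p&gt;
The steps above can be sketched in Python. This is only an illustration
under assumptions, not the bash script linked further down: the function
names ocr_text and build_message and the address used with them are
hypothetical, and the sketch assumes the tesseract binary is on the PATH.
It relies on tesseract printing the recognized text when the output name
is stdout.
&lt;/p&gt;
&lt;pre&gt;
```python
import mimetypes
import subprocess
from email.message import EmailMessage
from pathlib import Path


def ocr_text(scan_path):
    """Run tesseract on one scan and return the recognized text.

    Passing "stdout" as the output name makes tesseract print the text
    instead of writing it to OUTPUT.txt.
    """
    result = subprocess.run(
        ["tesseract", str(scan_path), "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_message(address, subject, body, scan_paths):
    """Build an email with the OCR text as body and the scans attached."""
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address  # the document is mailed to ourselves
    msg["Subject"] = subject
    msg.set_content(body)  # plain text, so the mail client can index it
    for path in map(Path, scan_paths):
        # Guess the attachment type from the file name, with a generic fallback.
        mime, _ = mimetypes.guess_type(path.name)
        maintype, subtype = (mime or "application/octet-stream").split("/", 1)
        msg.add_attachment(
            path.read_bytes(), maintype=maintype, subtype=subtype,
            filename=path.name,
        )
    return msg
```
&lt;/pre&gt;
&lt;p&gt;
A message built this way, for example with build_message(address, subject,
ocr_text(scan), [scan]), could then be handed to any mail transport, such
as send_message from smtplib pointed at a local relay.
&lt;/p&gt;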
&lt;p&gt;
This is all easy enough, but I&amp;#39;m lazy. I didn&amp;#39;t feel like opening up my
email client and doing manual copy-pasting, so I decided to automate the
process a little further. I have postfix set up on my system to relay to my
mail server, so I can simply use the
&lt;code&gt;
mail
&lt;/code&gt;
command to send
emails without a GUI mail client. I combined that with tesseract in a
little bash script. The script iterates through all of its arguments and
interprets them as filenames of scanned documents. It calls tesseract to
extract text from them, concatenates the results, attaches the files to an
email and sends it to my personal email address. Now all I have to do is
run the script with filenames of some documents and my job is done. If
anyone is interested in an actual program that does the same thing and
doesn&amp;#39;t require you to set up postfix, let me know! I might consider
authoring one if it&amp;#39;s useful to more people than just myself. The script
I&amp;#39;m currently using can be found
&lt;a href=&quot;scan-to-mailpile.bash.html&quot;&gt;
here
(pretty)
&lt;/a&gt;
and
&lt;a href=&quot;scan-to-mailpile.bash&quot;&gt;
here (raw)
&lt;/a&gt;
, but I
don&amp;#39;t recommend using it if you don&amp;#39;t fully understand its contents; it&amp;#39;s
not a polished user experience 🤓.
&lt;/p&gt;</description><pubDate>Mon, 17 Feb 2020 11:55:42 +0100</pubDate><guid isPermaLink="false"> How To Use Your Email Client For Physical Mail NDc2MDg1MjYxIDQxODUK</guid>
</item>
<item><title> Creating a Simple Static Blog </title><link>https://hugot.nl/posts/simple-static-blog/index.html</link><description></description><pubDate>Sat, 08 Feb 2020 12:14:16 +0100</pubDate><guid isPermaLink="false"> Creating a Simple Static Blog MjU5OTIyNDIwMyA2MTI5Cg==</guid>
</item>
<item><title> Introduction </title><link>https://hugot.nl/posts/introduction/index.html</link><description> &lt;h1&gt;
Introduction
&lt;/h1&gt;
&lt;p&gt;
Hello, welcome to my blog! My name is Hugo. I am a 22-year-old software engineering
student from the Netherlands. Software development is a huge part of my life: I write a
lot of (weird) programs to scratch my own itch and most software I create
is
&lt;a href=&quot;https://github.com/hugot&quot;&gt;
open source
&lt;/a&gt;
by default. I also run a one-man
company that provides some IT services on the side.
&lt;/p&gt;
&lt;p&gt;
Between working on projects and studying I like to watch movies &amp;amp; series, listen to music
&amp;amp; podcasts, ride my road bike and take hikes.
&lt;/p&gt;
&lt;h2&gt;
What kind of blog is this?
&lt;/h2&gt;
&lt;p&gt;
Because I&amp;#39;m quite new to this and I want to keep myself interested, I won&amp;#39;t be
limiting myself to a single topic. You can expect me to post about a variety of topics
that may interest/annoy/excite me at any given moment.
&lt;/p&gt;
&lt;p&gt;
May my posts be interesting and my posting schedule be consistent 🤓🖖
&lt;/p&gt;
&lt;p&gt;
I hope to see you around! - Hugo
&lt;/p&gt;</description><pubDate>Sat, 08 Feb 2020 09:30:06 +0100</pubDate><guid isPermaLink="false"> Introduction MzYzMzkyNDgwOCA5MDcK</guid>
</item>
</channel>
</rss>