Internet of Things Printer Project and CP437 chars

by andersks on Thu Mar 08, 2012 7:55 am

Hello everyone!

Just got my first Arduino (the Internet of Things Printer Project) in the mail, and I can happily report that it was really smooth sailing to get it up and running, including modifying the Gutenbird sketch to shift the timezone from UTC to GMT+1.

But when I tried to follow Norwegian tweets that included non-ASCII characters, things got a little more tricky for me. The sketch changes all of the characters in question to '-'s. Now, the printer itself is capable of printing the characters in question (ø, æ, Æ, å, Å (CP437)), and I've been able to print them when writing them directly into the sketch, but I cannot seem to figure out how to change the characters coming from Twitter before they are turned into '-'s. I've tried adding lines to the unidecode function, but somehow I suspect that's not all I will have to do ;)

Would anyone be so kind as to give me a hint? ;)

Thanks so much for your help!
Best Regards
Anders

Re: Internet of Things Printer Project and CP437 chars

by pburgess on Thu Mar 08, 2012 3:36 pm

Hi Anders,

Provided that the characters are already in the printer's character set (not needing custom bitmaps or whatnot), it should be relatively straightforward. The unidecode() function will need to be amended to remap certain Unicode values to the printer's faux-ASCII equivalents. You can get a map of the printer's charset by holding down the paper feed button when turning it on (rows = high 4 bits, columns = low 4 bits, e.g. ASCII 'A' = 0x41), and Wikipedia has a huge list of Unicode characters here.

Code:
int unidecode(byte len) {
  int c, v, result = 0;
  while(len--) {
    if((c = timedRead()) < 0) return -1; // Stream timeout
    if     ((c >= '0') && (c <= '9')) v =      c - '0';
    else if((c >= 'A') && (c <= 'F')) v = 10 + c - 'A';
    else if((c >= 'a') && (c <= 'f')) v = 10 + c - 'a';
    else return '-'; // garbage
    result = (result << 4) | v;
  }

  switch(result) {
    case 0x00EB: return 0xB8; // Latin Small Letter E with diaeresis
    case 0x00EF: return 0xBF; // Latin Small Letter I with diaeresis
    case 0x0436: return 0xE6; // Cyrillic Small Letter Zhe
    case 0x00A9: return 0xA9; // Copyright sign
    // ..etc...
  }

  return '-';
}
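
The digit loop at the top of unidecode() can be tried out on a desktop machine without the printer. The following is only a sketch of that loop: it reads from a string instead of the sketch's timedRead() stream, the name parseHexEscape is made up for this example, and it assumes the non-ASCII characters arrive as a run of hex digits (presumably from an escape such as \u00E5).

```cpp
#include <stddef.h>

// Host-side sketch of the digit-accumulation loop in unidecode(), reading
// from a string instead of the sketch's timedRead() stream.
// parseHexEscape is a hypothetical name, not part of the Gutenbird sketch.
// Returns the code point, or -1 if a non-hex character is encountered.
long parseHexEscape(const char *digits, size_t len) {
  long result = 0;
  for (size_t i = 0; i < len; i++) {
    char c = digits[i];
    int v;
    if      (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'A' && c <= 'F') v = 10 + c - 'A';
    else if (c >= 'a' && c <= 'f') v = 10 + c - 'a';
    else return -1;                 // not a hex digit
    result = (result << 4) | v;     // shift in 4 bits per digit
  }
  return result;
}
```

For example, parseHexEscape("00E5", 4) yields 0x00E5, which is the value the switch statement then has to translate into a printer byte.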


It's not necessary to do this with every character in the universe, just the ones you really need translated. Given that the printer has a finite 8-bit character set and not everything can be translated, maybe I'll sit down at some point and power through this, or a few of us could each take on a block and add it back to the repository.
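
The row/column layout of the printer's self-test chart mentioned above translates directly into a byte value. A minimal sketch, assuming rows and columns are counted from 0 to 15 (0x0 to 0xF); chartCode is a hypothetical helper name, not something from the sketch:

```cpp
#include <stdint.h>

// Combine a row (high 4 bits) and a column (low 4 bits) from the printer's
// self-test chart into the byte to send. chartCode is a hypothetical helper.
uint8_t chartCode(uint8_t row, uint8_t column) {
  return static_cast<uint8_t>(((row & 0x0F) << 4) | (column & 0x0F));
}
```

Row 4, column 1 gives 0x41, which is ASCII 'A' as in the example above. A character found in row 8, column 6 of the chart would be sent as 0x86.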

Re: Internet of Things Printer Project and CP437 chars

by andersks on Thu Mar 08, 2012 8:23 pm

Thank you so much for your help pburgess, it works perfectly now.

As some kind of thank-you I've done as much of the mapping as I could, following the example you showed me, with the wiki link you sent ;)

I have no idea how to add it back to the repository as you suggest (it could probably also be solved more elegantly than this list?), but here is at least something that seems to work (I haven't done very much testing on it though, so no guarantees from me :). Do feel free to modify/republish/whatever ;)

I also have an xls spreadsheet that I used to generate this, in case you or someone else would want to implement it better (for example as some kind of matrix instead of this list :)). Each value has its own cell, hopefully making it easier to copy/paste. If you're interested I'll put it somewhere you can find it ;)

Again, thank you very much!

Best regards
Anders

Code:
int unidecode(byte len) {
  int c, v, result = 0;
  while(len--) {
    if((c = timedRead()) < 0) return -1; // Stream timeout
    if     ((c >= '0') && (c <= '9')) v =      c - '0';
    else if((c >= 'A') && (c <= 'F')) v = 10 + c - 'A';
    else if((c >= 'a') && (c <= 'f')) v = 10 + c - 'a';
    else return '-'; // garbage
    result = (result << 4) | v;
  }

  switch(result) {
   
      // control and basic latin that is not handled elsewhere
      case 0x003C: return 0x3C; // < Less-than sign
      case 0x003E: return 0x3E; // > Greater-than sign
      case 0x007B: return 0x7B; // { Left Curly Bracket
      case 0x007D: return 0x7D; // } Right Curly Bracket
     
      //CP437 (printer charset) characters not handled elsewhere
      //0x8X
      case 0x00C7: return 0x80; // Ç Latin Capital letter C with cedilla
      case 0x00FC: return 0x81; // ü Latin Small Letter U with diaeresis
      case 0x00E9: return 0x82; // é Latin Small Letter E with acute
      case 0x00E2: return 0x83; // â Latin Small Letter A with circumflex
      case 0x00E4: return 0x84; // ä Latin Small Letter A with diaeresis
      case 0x00E0: return 0x85; // à Latin Small Letter A with grave
      case 0x00E5: return 0x86; // å Latin Small Letter A with ring above
      case 0x00E7: return 0x87; // ç Latin Small Letter C with cedilla
      case 0x00EA: return 0x88; // ê Latin Small Letter E with circumflex
      case 0x00EB: return 0x89; // ë Latin Small Letter E with diaeresis
      case 0x00E8: return 0x8A; // è Latin Small Letter E with grave
      case 0x00EF: return 0x8B; // ï Latin Small Letter I with diaeresis
      case 0x00EE: return 0x8C; // î Latin Small Letter I with circumflex
      case 0x00EC: return 0x8D; // ì Latin Small Letter I with grave
      case 0x00C4: return 0x8E; // Ä Latin Capital letter A with diaeresis
      case 0x00C5: return 0x8F; // Å Latin Capital letter A with ring above
     
      //0x9X
      case 0x00C9: return 0x90; // É Latin Capital letter E with acute
      case 0x00E6: return 0x91; // æ Latin Small Letter Æ
      case 0x00C6: return 0x92; // Æ Latin Capital letter Æ
      case 0x00F4: return 0x93; // ô Latin Small Letter O with circumflex
      case 0x00F6: return 0x94; // ö Latin Small Letter O with diaeresis
      case 0x00F2: return 0x95; // ò Latin Small Letter O with grave
      case 0x00FB: return 0x96; // û Latin Small Letter U with circumflex
      case 0x00F9: return 0x97; // ù Latin Small Letter U with grave
      case 0x00FF: return 0x98; // ÿ Latin Small Letter Y with diaeresis
      case 0x00D6: return 0x99; // Ö Latin Capital letter O with diaeresis
      case 0x00DC: return 0x9A; // Ü Latin Capital Letter U with diaeresis
      case 0x00A2: return 0x9B; // ¢ Cent sign
      case 0x00A3: return 0x9C; // £ Pound sign
      case 0x00A5: return 0x9D; // ¥ Yen sign
      //case 0x0000: return 0x9E; // ₧ couldn't find unicode equivalent
      case 0x0192: return 0x9F; // ƒ Latin Small Letter F with hook
     
      //0xAX
      case 0x00E1: return 0xA0; // á Latin Small Letter A with acute
      case 0x00ED: return 0xA1; // í Latin Small Letter I with acute
      case 0x00F3: return 0xA2; // ó Latin Small Letter O with acute
      case 0x00FA: return 0xA3; // ú Latin Small Letter U with acute
      case 0x00F1: return 0xA4; // ñ Latin Small Letter N with tilde
      case 0x00D1: return 0xA5; // Ñ Latin Capital letter N with tilde
      case 0x00AA: return 0xA6; // ª Feminine Ordinal Indicator
      case 0x00BA: return 0xA7; // º Masculine ordinal indicator
      case 0x00BF: return 0xA8; // ¿ Inverted Question Mark
      //case 0x0000: return 0xA9; // ⌐ couldn't find unicode equivalent
      case 0x00AC: return 0xAA; // ¬ Not sign
      case 0x00BD: return 0xAB; // ½ Vulgar fraction one half
      case 0x00BC: return 0xAC; // ¼ Vulgar fraction one quarter
      case 0x00A1: return 0xAD; // ¡ Inverted Exclamation Mark
      case 0x00AB: return 0xAE; // « Left-pointing double angle quotation mark
      case 0x00BB: return 0xAF; // » Right-pointing double angle quotation mark
     
      //0xBX
      case 0x2591: return 0xB0; // ░ Light shade
      case 0x2592: return 0xB1; // ▒ Medium shade
      case 0x2593: return 0xB2; // ▓ Dark shade
      case 0x2223: return 0xB3; // │ Mathematical operator (divides)
      case 0x22A3: return 0xB4; // ┤ Mathematical operator
      //case 0x22: return 0xB5; // ╡ couldn't find unicode equivalent
      //case 0x22: return 0xB6; // ╢ couldn't find unicode equivalent
      //case 0x22: return 0xB7; // ╖ couldn't find unicode equivalent
      //case 0x22: return 0xB8; // ╕ couldn't find unicode equivalent
      //case 0x22: return 0xB9; // ╣ couldn't find unicode equivalent
      case 0x2225: return 0xBA; // ║ Mathematical operator
      //case 0x22: return 0xBB; // ╗ couldn't find unicode equivalent
      //case 0x22: return 0xBC; // ╝ couldn't find unicode equivalent
      //case 0x22: return 0xBD; // ╜ couldn't find unicode equivalent
      //case 0x22: return 0xBE; // ╛ couldn't find unicode equivalent
      //case 0x22: return 0xBF; // ┐ couldn't find unicode equivalent
     
      //0xCX
      //case 0x0000: return 0xC0; // └ couldn't find unicode equivalent
      case 0x22A5: return 0xC1; // ┴ Mathematical operator
      case 0x22A4: return 0xC2; // ┬ Mathematical operator
      case 0x22A6: return 0xC3; // ├ Mathematical operator
      case 0x2212: return 0xC4; // ─ Mathematical operator
      //case 0x0000: return 0xC5; // ┼ couldn't find unicode equivalent
      case 0x22A7: return 0xC6; // ╞ Mathematical operator
      case 0x22A9: return 0xC7; // ╟ Mathematical operator
      //case 0x0000: return 0xC8; // ╚ couldn't find unicode equivalent
      //case 0x0000: return 0xC9; // ╔ couldn't find unicode equivalent
      //case 0x0000: return 0xCA; // ╩ couldn't find unicode equivalent
      //case 0x0000: return 0xCB; // ╦ couldn't find unicode equivalent
      //case 0x0000: return 0xCC; // ╠ couldn't find unicode equivalent
      //case 0x0000: return 0xCD; // ═ couldn't find unicode equivalent
      //case 0x0000: return 0xCE; // ╬ couldn't find unicode equivalent
      //case 0x0000: return 0xCF; // ╧ couldn't find unicode equivalent
     
      //0xDX
      //case 0x0000: return 0xD0; // ╨ couldn't find unicode equivalent
      //case 0x0000: return 0xD1; // ╤ couldn't find unicode equivalent
      //case 0x0000: return 0xD2; // ╥ couldn't find unicode equivalent
      //case 0x0000: return 0xD3; // ╙ couldn't find unicode equivalent
      //case 0x0000: return 0xD4; // ╘ couldn't find unicode equivalent
      //case 0x0000: return 0xD5; // ╒ couldn't find unicode equivalent
      //case 0x0000: return 0xD6; // ╓ couldn't find unicode equivalent
      //case 0x0000: return 0xD7; // ╫ couldn't find unicode equivalent
      //case 0x0000: return 0xD8; // ╪ couldn't find unicode equivalent
      //case 0x0000: return 0xD9; // ┘ couldn't find unicode equivalent
      //case 0x0000: return 0xDA; // ┌ couldn't find unicode equivalent
      case 0x2588: return 0xDB; // █ Full block
      case 0x2584: return 0xDC; // ▄ Lower half block
      case 0x258C: return 0xDD; // ▌ Left half block
      case 0x2590: return 0xDE; // ▐ Right half block
      case 0x2580: return 0xDF; // ▀ Upper half block
     
      //0xEX
      case 0x03B1: return 0xE0; // α Greek Small Letter Alpha
      case 0x03B2: return 0xE1; // ß Greek Small Letter Beta
      case 0x0393: return 0xE2; // Γ Greek Capital Letter Gamma
      case 0x03C0: return 0xE3; // π Greek Small Letter Pi
      case 0x03A3: return 0xE4; // Σ Greek Capital Letter Sigma
      case 0x03C3: return 0xE5; // σ Greek Small Letter Sigma
      case 0x00B5: return 0xE6; // µ Micro sign
      case 0x03C4: return 0xE7; // τ Greek Small Letter Tau
      case 0x03A6: return 0xE8; // Φ Greek Capital Letter Phi
      case 0x0398: return 0xE9; // Θ Greek Capital Letter Theta
      case 0x03A9: return 0xEA; // Ω Greek Capital Letter Omega
      case 0x03B4: return 0xEB; // δ Greek Small Letter Delta
      case 0x221E: return 0xEC; // ∞ Mathematical operator
      case 0x03C6: return 0xED; // φ Greek Small Letter Phi
      case 0x03B5: return 0xEE; // ε Greek Small Letter Epsilon
      case 0x2229: return 0xEF; // ∩ Mathematical operator
     
      //0xFX
      case 0x2261: return 0xF0; // ≡ Mathematical operator
      case 0x00B1: return 0xF1; // ± Plus-minus sign
      case 0x2265: return 0xF2; // ≥ Mathematical operator
      case 0x2264: return 0xF3; // ≤ Mathematical operator
      //case 0x22: return 0xF4; // ⌠ Mathematical operator
      //case 0x22: return 0xF5; // ⌡ Mathematical operator
      case 0x00F7: return 0xF6; // ÷ Mathematical operator
      case 0x2248: return 0xF7; // ≈ Mathematical operator
      case 0x00B0: return 0xF8; // ° Degree symbol
      case 0x2219: return 0xF9; // ∙ Mathematical operator
      case 0x22C5: return 0xFA; // · Mathematical operator
      case 0x221A: return 0xFB; // √ Mathematical operator
      //case 0x22: return 0xFC; // ⁿ
      case 0x00B2: return 0xFD; // ² Superscript two
      case 0x25A0: return 0xFE; // ■ Black square

  }

  return '-';
}
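
The "matrix instead of this list" idea from the post above could be sketched as a sorted lookup table. This is only one possible shape, not what ended up in the repository: CharMap, kCharMap and lookupPrinterChar are hypothetical names, and the table holds just a handful of rows from the switch statement to keep the example short.

```cpp
#include <stddef.h>
#include <stdint.h>

// One Unicode code point and the printer byte it should become.
struct CharMap {
  uint16_t unicode;
  uint8_t  printer;
};

// A few rows taken from the switch statement above; a full table would list
// every mapping. Kept sorted by code point so it can be binary-searched.
static const CharMap kCharMap[] = {
  { 0x00C5, 0x8F },  // Å
  { 0x00C6, 0x92 },  // Æ
  { 0x00E5, 0x86 },  // å
  { 0x00E6, 0x91 },  // æ
  { 0x2591, 0xB0 },  // ░
};
static const size_t kCharMapLen = sizeof(kCharMap) / sizeof(kCharMap[0]);

// Hypothetical replacement for the switch: returns the printer byte, or '-'
// when the code point is not in the table.
int lookupPrinterChar(uint16_t codePoint) {
  size_t lo = 0, hi = kCharMapLen;          // search window is [lo, hi)
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (kCharMap[mid].unicode == codePoint) return kCharMap[mid].printer;
    if (kCharMap[mid].unicode < codePoint) lo = mid + 1;
    else                                   hi = mid;
  }
  return '-';
}
```

With the table filled in from the spreadsheet, adding a character becomes a one-row change. On the Arduino itself the table would probably be better kept in flash rather than RAM; that is left out here so the sketch stays portable.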

Re: Internet of Things Printer Project and CP437 chars

by pburgess on Thu Mar 08, 2012 9:27 pm

Wow, fantastic! Thanks so much for preparing that, it's no small effort. I'll work that into the next update, and will credit you in the code.

Re: Internet of Things Printer Project and CP437 chars

by Dan Malec on Wed Mar 21, 2012 12:36 am

Apologies for reviving an older thread, but I've been playing around with box drawing on the IoT printer and found a Wikipedia entry on Unicode 6.1 box drawing characters: http://en.wikipedia.org/wiki/Box-drawing_character#Unicode. I thought I could fill in some more of the 0xBx, 0xCx, & 0xDx cases from Anders's code. I've grabbed those three regions in their entirety. Where there was no mapping, I've replaced the case value with the box drawing code; where there was already a mathematical mapping that is also a box symbol, I've added the box drawing mapping as a fall-through case above it; everything else is left as is from Anders's code. I'm hoping that, if you decide to include the box characters, this will help with the merge.

Code:
      //0xBX
      case 0x2591: return 0xB0; // ░ Light shade
      case 0x2592: return 0xB1; // ▒ Medium shade
      case 0x2593: return 0xB2; // ▓ Dark shade
      case 0x2502:              // │ Vertical Line (1V)
      case 0x2223: return 0xB3; // │ Mathematical operator (divides)
      case 0x2524:              // ┤ Left facing tee (1V/1H)
      case 0x22A3: return 0xB4; // ┤ Mathematical operator
      case 0x2561: return 0xB5; // ╡ Left facing tee (1V/2H)
      case 0x2562: return 0xB6; // ╢ Left facing tee (2V/1H)
      case 0x2556: return 0xB7; // ╖ Top right corner (2V/1H)
      case 0x2555: return 0xB8; // ╕ Top right corner (1V/2H)
      case 0x2563: return 0xB9; // ╣ Left facing tee (2V/2H)
      case 0x2551:              // ║ Vertical Line (2V)
      case 0x2225: return 0xBA; // ║ Mathematical operator
      case 0x2557: return 0xBB; // ╗ Top right corner (2V/2H)
      case 0x255D: return 0xBC; // ╝ Bottom right corner (2V/2H)
      case 0x255C: return 0xBD; // ╜ Bottom right corner (2V/1H)
      case 0x255B: return 0xBE; // ╛ Bottom right corner (1V/2H)
      case 0x2510: return 0xBF; // ┐ Top right corner (1V/1H)
     
      //0xCX
      case 0x2514: return 0xC0; // └ Bottom left corner (1V/1H)
      case 0x2534:              // ┴ Top facing tee (1V/1H)
      case 0x22A5: return 0xC1; // ┴ Mathematical operator
      case 0x252C:              // ┬ Bottom facing tee (1V/1H)
      case 0x22A4: return 0xC2; // ┬ Mathematical operator
      case 0x251C:              // ├ Right facing tee (1V/1H)
      case 0x22A6: return 0xC3; // ├ Mathematical operator
      case 0x2500:              // ─ Horizontal Line (1H)
      case 0x2212: return 0xC4; // ─ Mathematical operator
      case 0x253C: return 0xC5; // ┼ Plus (1V/1H)
      case 0x255E:              // ╞ Right facing tee (1V/2H)
      case 0x22A7: return 0xC6; // ╞ Mathematical operator
      case 0x255F:              // ╟ Right facing tee (2V/1H)
      case 0x22A9: return 0xC7; // ╟ Mathematical operator
      case 0x255A: return 0xC8; // ╚ Bottom left corner (2V/2H)
      case 0x2554: return 0xC9; // ╔ Top left corner (2V/2H)
      case 0x2569: return 0xCA; // ╩ Top facing tee (2V/2H)
      case 0x2566: return 0xCB; // ╦ Bottom facing tee (2V/2H)
      case 0x2560: return 0xCC; // ╠ Right facing tee (2V/2H)
      case 0x2550: return 0xCD; // ═ Horizontal Line (2H)
      case 0x256C: return 0xCE; // ╬ Plus (2V/2H)
      case 0x2567: return 0xCF; // ╧ Top facing tee (1V/2H)
     
      //0xDX
      case 0x2568: return 0xD0; // ╨ Top facing tee (2V/1H)
      case 0x2564: return 0xD1; // ╤ Bottom facing tee (1V/2H)
      case 0x2565: return 0xD2; // ╥ Bottom facing tee (2V/1H)
      case 0x2559: return 0xD3; // ╙ Bottom left corner (2V/1H)
      case 0x2558: return 0xD4; // ╘ Bottom left corner (1V/2H)
      case 0x2552: return 0xD5; // ╒ Top left corner (1V/2H)
      case 0x2553: return 0xD6; // ╓ Top left corner (2V/1H)
      case 0x256B: return 0xD7; // ╫ Plus (2V/1H)
      case 0x256A: return 0xD8; // ╪ Plus (1V/2H)
      case 0x2518: return 0xD9; // ┘ Bottom right corner (1V/1H)
      case 0x250C: return 0xDA; // ┌ Top left corner (1V/1H)
      case 0x2588: return 0xDB; // █ Full block
      case 0x2584: return 0xDC; // ▄ Lower half block
      case 0x258C: return 0xDD; // ▌ Left half block
      case 0x2590: return 0xDE; // ▐ Right half block
      case 0x2580: return 0xDF; // ▀ Upper half block
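
A few glyphs outside the box-drawing regions are still commented out in the original list (0x9E, 0xA9, 0xF4, 0xF5 and 0xFC). Assuming the printer's upper 128 characters follow standard CP437, which should be checked against the printer's own self-test chart, the matching Unicode code points could be added along these lines. remainingCp437 is a hypothetical helper written for this example; in the sketch the cases would simply go into the switch in unidecode().

```cpp
#include <stdint.h>

// Code points for the CP437 glyphs still commented out in the list above.
// Assumes the printer's upper 128 characters match standard CP437; check
// against the printer's self-test chart before relying on these.
// remainingCp437 is a hypothetical helper; returns 0 when there is no match.
int remainingCp437(uint16_t codePoint) {
  switch (codePoint) {
    case 0x20A7: return 0x9E; // ₧ Peseta sign
    case 0x2310: return 0xA9; // ⌐ Reversed not sign
    case 0x2320: return 0xF4; // ⌠ Top half integral
    case 0x2321: return 0xF5; // ⌡ Bottom half integral
    case 0x207F: return 0xFC; // ⁿ Superscript latin small letter n
  }
  return 0;
}
```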