I generally use Solitaired's implementation of the game, and one thing in particular caught my eye while looking at the new-game menu:
On this menu is a "winnable only" option, whereby the game served is known to be a shuffle of the deck that, when dealt out, definitely has at least one path to solution. The inevitable question arises:
As it turns out, there's no formula for determining whether a given shuffle is solvable: what we'll need to do here is build a solver that can search through all the possible moves, until it runs into a state where the game has been won and all the cards are stacked up in their end positions.
Let's take the following initial deal-out:
From top-left on the game board, we have:
Other variants of Solitaire exist, but as we're dealing with Klondike here, we can hard-code various things like the number of cards to shuffle, and the number of tableaus.
When we say "build a solver", what we're looking to do is take an initial deal-out of a shuffled deck like the above, and apply those moves which are both:
The plan is to apply one of the eligible valid moves to the game board, see where the board ends up, and apply a move which has now become valid and eligible; we keep doing this until our path runs out and we have no more available moves. If the game hasn't been solved at this point (if, in other words, we don't have the suits all stacked up in the foundations), we need to backtrack up the path of moves and make another eligible and valid move from the point at which we have the option available.
If we take Figure 2, and work out those moves which are both valid and eligible, we may end up with a tree of moves that looks something like this:
The more algorithmically astute reader will recognise this as a description of depth-first search (DFS), a process whereby a tree of things (in this case, game states) can be searched through for some criterion. In the general case, breadth-first search tends to be used if one is looking for the optimal match; for our solver, that would mean the fewest moves to a winning state. For our purposes though, as we're just looking to see whether the game can be solved at all, we can use the simpler DFS algorithm.
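As a minimal sketch of the idea, here is DFS with backtracking on a toy tree rather than a Solitaire board. The `children` map and its node names are invented purely for illustration, and `dfs` is a hypothetical helper, not part of the solver built later.

```javascript
// Toy tree of states: each key lists the states reachable in one move.
// These names are made up for illustration; they are not real game states.
const children = {
  start: ['a', 'b'],
  a: ['a1', 'a2'],
  b: ['b1'],
  a1: [],
  a2: [],
  b1: ['win'],
  win: [],
};

// Depth-first search: follow one path to its end, then backtrack.
// Returns the path to the first goal found, or null if there is none.
function dfs(node, isGoal, path = []) {
  const here = [...path, node];
  if (isGoal(node)) return here;
  for (const next of children[node] || []) {
    const found = dfs(next, isGoal, here);
    if (found) return found;
  }
  return null; // dead end: the caller moves on to its next option
}

console.log(dfs('start', (n) => n === 'win'));
```

The search exhausts everything under `a` before it ever looks at `b`, which is the backtracking behaviour described above: the path it reports is start, b, b1, win.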
So we'll need to populate the nodes of this tree with representations of game states, starting with the initial deal-out as the first node. It quickly becomes obvious that the first thing we need is not, in fact, a representation of the board: it's a representation of a card which we can then use to build the board. Our "card" needs three things: suit (one of four), rank (one of thirteen), and whether the card in question is face-up on the board. This lends itself well to a bitfield:
Our state representation then becomes an object containing arrays of numbers, where each of the numbers conforms to this bitfield format for a card:
{
  "stock": [
    4,45,24,12,35,41,29,57,36,37,6,34,
    23,17,38,2,28,53,51,40,50,59,21,60
  ],
  "waste": [],
  "foundations": [
    [], // Spades
    [], // Hearts
    [], // Clubs
    []  // Diamonds
  ],
  "tableaus": [
    [107],
    [54,69],
    [58,55,77],
    [19,27,18,116],
    [10,8,56,20,106],
    [3,9,61,7,44,86],
    [49,1,11,33,25,39,90]
  ]
}
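To make the numbers above easier to read, here is a small sketch that unpacks a card value. The bit layout is inferred from the masks used in this article's code (rank in the low four bits, suit in the next two, and `0x40` as the face-up flag); the helper names `decodeCard`, `encodeCard` and `describeCard` are hypothetical.

```javascript
const SUITS = ['Spades', 'Hearts', 'Clubs', 'Diamonds'];
const RANKS = [
  'Ace', '2', '3', '4', '5', '6', '7',
  '8', '9', '10', 'Jack', 'Queen', 'King',
];

// Split a card number into its three fields
function decodeCard(c) {
  return {
    rank: c & 15,            // bits 0-3: 1 (Ace) to 13 (King)
    suit: (c >> 4) & 3,      // bits 4-5: index into SUITS
    faceUp: (c & 0x40) !== 0 // bit 6: set when the card is face-up
  };
}

// Pack the three fields back into one number
function encodeCard(suit, rank, faceUp) {
  return (faceUp ? 0x40 : 0) | (suit << 4) | rank;
}

// Human-readable description, for debugging a state dump
function describeCard(c) {
  const { rank, suit, faceUp } = decodeCard(c);
  const side = faceUp ? 'face-up' : 'face-down';
  return `${RANKS[rank - 1]} of ${SUITS[suit]} (${side})`;
}

console.log(describeCard(107)); // the only card in the first tableau
console.log(describeCard(4));   // the first card in the stock
```

Under that layout, 107 is 64 + 32 + 11: face-up, suit 2 (Clubs), rank 11 (Jack).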
And moves on the board become changes in this state representation. If we're going to deal with a lot of these state changes, we're going to need a way to visualise what's happening (or at least, I needed a visual way to make sense of things)...
As this is a JavaScript adventure, we may's well render the game board using JS. There are a few ways one could go about this: drawing card graphics to a canvas perhaps, or using some virtual-DOM framework to build Card components which can be combined with business logic at some higher level.
That seems a bit much, though. For our purposes, we can get away with building the game board out in HTML, and re-rendering the whole page whenever a move is made. Let's start once again with the individual cards, for which there are complete sets of open-source SVGs online:
<span class="card s2">2 of Spades</span>
.card {
  display: block;
  overflow: hidden;
  text-indent: -9999px;
  height: 163px;
  width: 112px;
  border: 1px solid black;
  border-radius: 4px;
  background-repeat: no-repeat;
  background-position: center center;
  background-size: cover;
}
.s1 { background-image: url(ace_of_spades.svg); }
.s2 { background-image: url(2_of_spades.svg); }
/* ... 48 card backgrounds omitted ... */
.d12 { background-image: url(queen_of_diamonds.svg); }
.d13 { background-image: url(king_of_diamonds.svg); }
.facedown { background-image: url(cardback.svg); }
For this project, I've used the simple set of SVGs from hayeah's playing-cards-assets for the card fronts, and Dmitry Fomin's SVG on Wikidata for the card back.
Our JavaScript for the card renderer doesn't need to be all that complicated either, as we can build HTML directly as a string:
const Card = {
  SUIT_CLASS: ['s', 'h', 'c', 'd'],
  SUIT_NAMES: ['Spades', 'Hearts', 'Clubs', 'Diamonds'],
  RANK_NAMES: [
    'Ace', 2, 3, 4, 5, 6, 7, 8, 9, 10, 'Jack', 'Queen', 'King'
  ],

  // Suit letter plus rank number, matching the CSS classes (e.g. "s2")
  toClassName: c => `${Card.SUIT_CLASS[(c >> 4) & 3]}${(c & 15)}`,

  // Ranks run from 1 to 13, so subtract one to index into RANK_NAMES
  toString: c => (
    Card.RANK_NAMES[(c & 15) - 1] +
    ' of ' +
    Card.SUIT_NAMES[(c >> 4) & 3]
  ),

  // Face-down cards get the card back instead of their own class
  render: c => [
    '<span class="card ',
    (c & 0x40) ? Card.toClassName(c) : 'facedown',
    '">',
    Card.toString(c),
    '</span>'
  ].join('')
};

document.body.innerHTML = Card.render(0x42); // 2 of spades, face-up
The above code nets us something that looks like this:
From here, the individual regions of the board can be rendered as cards or lists of cards. For simplicity, we can treat regions where one card is visible as lists containing one card:
<body>
  <main id="field">
    <ul id="stock"></ul>
    <ul id="waste"></ul>
    <ul id="foundations"></ul>
    <ul id="tableaus">
      <li id="tableau-0">
        <ol></ol>
      </li>
      <!-- And the other tableaus... -->
      <li id="tableau-6">
        <ol></ol>
      </li>
    </ul>
  </main>
</body>
:root {
  --card-width: 112px;
  --card-height: 163px;
  --card-stack-gap: 40px;
}
body {
  background: green;
}
#field {
  display: grid;
  grid-template-areas:
    "stock waste foundations"
    "tableaus tableaus tableaus";
  grid-template-columns: var(--card-width) var(--card-width) 1fr;
  grid-template-rows: var(--card-height) 1fr;
  gap: 32px;
}
ul, ol {
  list-style: none inside;
}
#stock { grid-area: stock; }
#waste { grid-area: waste; }
#foundations {
  grid-area: foundations;
  display: flex;
  gap: 32px;
  justify-content: flex-end;
}
#tableaus {
  grid-area: tableaus;
  display: flex;
  justify-content: space-between;
  gap: 32px;
}
#tableaus > li {
  width: var(--card-width);
}
#tableaus ol {
  position: relative;
  width: var(--card-width);
}
#tableaus ol li {
  position: absolute;
}
li:nth-child(2) { top: calc(var(--card-stack-gap) * 1); }
li:nth-child(3) { top: calc(var(--card-stack-gap) * 2); }
/* ... If the last tableau has a King face-up card initially,
   we can have up to 19 cards in a tableau ... */
li:nth-child(18) { top: calc(var(--card-stack-gap) * 17); }
li:nth-child(19) { top: calc(var(--card-stack-gap) * 18); }
And the final piece of the puzzle is to shuffle a deck of cards into a representation of the initial game state, and fill in the above list elements. Those'll be two separate functions: let's have a look at the shuffle first.
A full deck of playing cards is 52 cards, but we have two considerations to make:
Some languages offer an array_shuffle library function to randomise an array, but JavaScript's standard library is lacking in this regard. We'll need a scrap of custom shuffling code to do the trick.
The magic is served in the form of Array.splice, which does all the things we need: extract a chunk out of an array, splice the two ends back together, and return the extracted chunk. Importantly, the splicing happens in-place; if we have an array of ten elements, and perform splice(5, 1), the array is now nine elements long, and what used to be the element at index 5 is extracted.
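The ten-element case can be tried directly, along with a sketch of how repeated splicing gives a shuffle. The helper names `drawRandom` and `shuffle` are hypothetical, and the optional `random` parameter is an assumption added here only so the behaviour can be checked with a fixed source.

```javascript
// The ten-element example: splice(5, 1) removes one element at index 5
const letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'];
const removed = letters.splice(5, 1);
console.log(removed);        // an array holding the extracted element
console.log(letters.length); // one shorter than before

// Pull one random element out of the deck, shrinking it in place
function drawRandom(deck, random = Math.random) {
  const index = Math.floor(random() * deck.length);
  return deck.splice(index, 1)[0];
}

// Keep drawing until the deck is empty; the draw order is the shuffle
function shuffle(items, random = Math.random) {
  const deck = [...items]; // copy, so the caller's array is untouched
  const out = [];
  while (deck.length > 0) {
    out.push(drawRandom(deck, random));
  }
  return out;
}

console.log(shuffle([1, 2, 3, 4, 5]));
```

Because each draw picks uniformly from whatever is left, every ordering is equally likely, provided the random source itself is uniform.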
With that, we have everything we need to build up the initial state:
const Solitaire = {
  init: () => {
    // Build up a deck of cards in our bitfield format
    let deck = [], shuffled = [];
    for (let i = 0; i < 4; i++) {
      for (let j = 1; j <= 13; j++) {
        deck.push(i << 4 | j);
      }
    }

    // Pull random cards out of the deck until we run out
    do {
      shuffled.push(deck.splice(
        0 | (Math.random() * deck.length),
        1
      )[0]);
    } while (deck.length > 0);

    // Deal out the tableaus, drop what's left in the stock
    const state = {
      stock: [],
      waste: [],
      foundations: [[], [], [], []],
      tableaus: [[], [], [], [], [], [], []],
    };
    state.tableaus.forEach((tb, idx) => {
      // Tableau idx receives idx + 1 cards
      for (let i = 0; i <= idx; i++) {
        state.tableaus[idx].push(shuffled.pop());
      }
      // Face-up the last card
      state.tableaus[idx][state.tableaus[idx].length - 1] |= 0x40;
    });
    state.stock = [...shuffled];

    Solitaire.render(state);
  },

  render: (state) => {
    // TODO
  },
};

window.onload = function() {
  Solitaire.init();
}
Rendering the board is a simple case of building HTML for each of the regions on the board, filling in the cards as list items. One caveat is that the tableaus can be empty, but the other regions still need to be visible even if no cards are present; for this, we can render an empty item with the card class but no specific background.
render: (state) => {
  const emptyCard = '<li class="card"></li>';

  // Stock and waste show only their top card
  document.getElementById('stock').innerHTML =
    state.stock.length > 0
      ? Card.render(state.stock[state.stock.length - 1])
      : emptyCard;
  document.getElementById('waste').innerHTML =
    state.waste.length > 0
      ? Card.render(state.waste[state.waste.length - 1])
      : emptyCard;

  // Each foundation shows its top card, or an empty slot
  document.getElementById('foundations').innerHTML =
    state.foundations.map(f => (f.length > 0
      ? Card.render(f[f.length - 1])
      : emptyCard
    )).join('');

  // Tableaus show every card in the stack
  for (let i = 0; i < 7; i++) {
    document.getElementById(`tableau-${i}`).innerHTML = [
      '<ol>',
      ...(state.tableaus[i].map(t => Card.render(t))),
      '</ol>',
    ].join('');
  }
},
Now we can see the Solitaire game board, we can visualise moves made towards solving a game. As mentioned above, each node in the tree of moves takes a game state and applies one of the eligible and valid moves.
Valid moves are those available to us under the game rules. In order of preference:
For each of these, we'll need a function that generates the next state in the game, given a state representation. In the interests of brevity, I won't include all the move implementations here; a flavour of them can be seen from the sample below, of one of the more complex cases.
const end = (a) => a[a.length - 1];

const Card = {
  ...
  rank: (c) => c & 15,
  suit: (c) => (c >> 4) & 3,
  areOpposite: (c, d) => (c & 16) ^ (d & 16),
  isFaceUp: (c) => !!(c & 64)
};

const Solitaire = {
  ...
  moves: [
    ...
    // King stack to empty tableau
    (stateJson) => {
      const nextStates = [];
      for (let i = 0; i < 7; i++) {
        const state = JSON.parse(stateJson);

        // If there's more than one card on this tableau, AND
        // the first face-up card is a King, AND
        // there's an empty tableau to move the stack to, AND
        // we'd be left with at least one card after moving
        if (state.tableaus[i].length > 1) {
          if (!Card.isFaceUp(state.tableaus[i][0])) {
            if (
              Card.rank(
                state.tableaus[i].filter(c => c & 64)[0]
              ) === 13
            ) {
              // There may be multiple empty tableaus, but this
              // handler only records the move to the first one found
              for (let j = 0; j < 7; j++) {
                if (state.tableaus[j].length === 0) {
                  // Pop the face-up cards off tableau i into j
                  while (end(state.tableaus[i]) & 64) {
                    state.tableaus[j].push(
                      state.tableaus[i].pop()
                    );
                  }
                  // j is now backwards
                  state.tableaus[j].reverse();

                  // Face-up the top card left behind
                  state.tableaus[i][state.tableaus[i].length - 1] |= 64;

                  // Record this as a valid next state
                  nextStates.push(state);
                  break;
                }
              }
            }
          }
        }
      }
      return nextStates;
    }
  ],
};
Those with an eye for detail will note that the move handlers are receiving a JSON string representing the game state. The search algorithm itself calls for this, as our DFS solver implementation will determine the next state as follows:
We can satisfy the "already seen it" clause here by calculating a hash of each game state as it's generated, and storing it in a list; if our next-move handlers generate a game state whose hash is already in this list, we won't need to recurse into it another time, thus limiting the search space.
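The bookkeeping itself can be sketched in a few lines. Two substitutions are made here for illustration only: the JSON string is used directly as the key instead of a hash, and a `Set` stands in for the list. `markIfNew` is a hypothetical helper name.

```javascript
// Every state key we have already visited
const seen = new Set();

// Returns true the first time a state is offered, false on any repeat
function markIfNew(state) {
  const key = JSON.stringify(state);
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
}

const first = { stock: [4, 45], waste: [] };
const moved = { stock: [4], waste: [45] };

console.log(markIfNew(first)); // first sighting
console.log(markIfNew(moved)); // a different state
console.log(markIfNew({ stock: [4, 45], waste: [] })); // a repeat of the first
```

One caveat with this approach: `JSON.stringify` follows property insertion order, so two state objects only produce the same key if their properties were created in the same order.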
Recent browser implementations of JavaScript offer the crypto service, which exposes various hashing functions that run at native speed, so we won't need to worry overly about performance:
const sha256 = async (state) => {
  const msg = new TextEncoder().encode(JSON.stringify(state));
  const buf = await window.crypto.subtle.digest('SHA-256', msg);
  const arr = Array.from(new Uint8Array(buf));
  return arr.map(b => ('00' + b.toString(16)).slice(-2)).join('');
};

const Solitaire = {
  ...
  isSearching: true,
  isWinnable: false,
  visitedMoves: [],
  MAX_MOVES: 50000,

  next: async (state) => {
    const hash = await sha256(state);
    Solitaire.visitedMoves.push(hash);

    // If all the cards are in the foundations,
    // assume they're in order and we've won
    if (state.foundations.filter(
      f => f.length === 13
    ).length === 4) {
      Solitaire.isSearching = false;
      Solitaire.isWinnable = true;
      Solitaire.render(state);
    }

    // If we've been going for ...a while,
    // assume we won't be going to space today
    if (Solitaire.visitedMoves.length > Solitaire.MAX_MOVES) {
      Solitaire.isSearching = false;
      Solitaire.isWinnable = false;
      Solitaire.render(state);
    }

    // Otherwise we're still going
    if (Solitaire.isSearching) {
      Solitaire.render(state);

      // Collect all the eligible moves
      let eligibleMoves = [], newMoves = [];
      Solitaire.moves.forEach((move) => {
        eligibleMoves = eligibleMoves.concat(
          move(JSON.stringify(state))
        );
      });

      // Filter for those we haven't seen before
      for (let i = 0; i < eligibleMoves.length; i++) {
        const newHash = await sha256(eligibleMoves[i]);
        if (!Solitaire.visitedMoves.includes(newHash)) {
          newMoves.push(eligibleMoves[i]);
        }
      }

      // Dig a little deeper
      for (let i = 0; i < newMoves.length; i++) {
        await Solitaire.next(newMoves[i]);
      }
    }
  }
};
And if we've done everything correctly, with a bit of luck from the random number generator, we'll find a shuffle that can be won:
So we've answered, to an extent at least, the question of how to determine that a game of Solitaire is winnable. There are a few things that come to mind once we see this solver working:
Solitaire.next in our code to use BFS in this fashion.
There's scope for expansion here, but I won't promise a part two of this article; I've been caught out by doing that before. Instead, I'll leave the code for the solver as it stands:
For example, many years ago I signed up to TV Tropes (probably to post a comment on some trope page). At some point, TV Tropes suffered a data breach and their user database was lifted, including email addresses, which means to this day I get emails like the following:
From: "Equipe RH" <rfgfx@[redacted]>
To: <tropes@imrannazar.com>
Subject: I RECORDED YOU!
Date: Thu, 13 Oct 2023 15:27:47 +0330

Hello there!
Unfortunately, there are some bad news for you.
Some time ago your device was infected with my private trojan, R.A.T (Remote Administration Tool), if you want to find out more about it simply use Google.
My trojan allows me to access your accounts, your camera and microphone.
[cut, but you get the idea]
So tropes@ gets routed to my inbox; in fact, anything @ my domain gets routed to the same inbox. This is what is meant by the term wildcard above: any value is a match. As well as allowing for spam provenance like the example above, this also helps with email filtering: if you're dealing with a certain company by email, your account email address can be thatcompany@yourdoma.in and your preferred email client can automatically filter any mail received to that address into the appropriate place.
If you've purchased a domain, there are levels of email service available to you: one is a fully-hosted service, where the Big Providers like Microsoft and Google offer the ability to use their servers for all email handling, so your domain essentially falls under their control for email purposes.
At the other end of the spectrum is the self-hosted mailserver, where a machine under your control handles and stores email for the domain. For this quick note, we'll be installing and configuring the Postfix mail package on a Debian Linux machine. The Debian Wiki has a useful guide on installation and post-install steps like configuring DKIM and greylists, but it boils down to apt install postfix for our purposes.
The above guide has a section on aliasing, where emails to one address get automatically forwarded to another. We'll be setting up a wildcard alias, which involves a couple of steps; first is to add an aliases file.
echo "*: youracct" >> /etc/aliases
newaliases

postconf -e "alias_maps = hash:/etc/aliases"
postconf -e "alias_database = hash:/etc/aliases"

This enables Postfix to treat any incoming address as though it were coming to a user of that name, but we'll also need to add a virtual alias for the domain routing:

echo "yourdoma.in magic" >> /etc/postfix/virtual
echo "@yourdoma.in youracct" >> /etc/postfix/virtual
postmap /etc/postfix/virtual

postconf -e "virtual_alias_maps = hash:/etc/postfix/virtual"
service postfix reload
And in theory, we're done: email sent to any address at your domain should now land in your local mailbox. Delivery of the mail to your client of choice through IMAP is outside the scope of this quick hack, but I've used dovecot for a good while, and haven't had any issues.
It was just after Halloween 2023, and I was signing up to some SaaS provider's service. The site popped up a message saying I'd be receiving a verification email with a thing to click on, and then I could log in. Fairly normal signup behaviour for a site at this point.
Except the email never arrived. I headed over to mail.log to see what Postfix was saying...

postfix/smtpd[1141504]: connect from m206-43.eu.mailgun.net
postfix/smtpd[1141504]: SSL_accept error from m206-43.eu.mailgun.net: -1
postfix/smtpd[1141504]: warning: TLS library problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:../ssl/record/rec_layer_s3.c:1543: SSL alert number 42:
postfix/smtpd[1141504]: lost connection after STARTTLS from m206-43.eu.mailgun.net

Well, that's weird. A little searching around reveals a thread on StackExchange where a helpful dave_thompson_085 has this advice:
Receiving alert bad certificate (code 42) means the server demands you authenticate with a certificate, and you did not do so, and that caused the handshake failure...
Find a certificate issued by a CA in the 'acceptable' list...
So there's some kind of problem with the other end's CA. But it's Mailgun, you'd expect them to have a handle on keeping things up to date; let's check what their certificate looks like.
$ openssl s_client -showcerts -connect api.mailgun.net:443
CONNECTED(00000005)
depth=2 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert Global Root G2
verify return:1
depth=1 C = US, O = DigiCert Inc, CN = DigiCert Global G2 TLS RSA SHA256 2020 CA1
verify return:1
depth=0 C = US, ST = Texas, L = San Antonio, O = "MAILGUN TECHNOLOGIES, INC", CN = *.api.mailgun.net
verify return:1
...
This isn't the certificate used by Mailgun when attempting to send that email, but it is a certificate used by Mailgun, and it's signed by the DigiCert CA. If we check with DigiCert, we see that the 2020 CA certificate has been superseded as of March 2023, so that's probably our problem: our copy of DigiCert's CA is too old.
No problem, we just update it:
# apt search ca-certificates
Sorting... Done
Full Text Search... Done
ca-certificates/stable,now 20210119 all [installed]
  Common CA certificates
Ah, right, yes. Bullseye has certificates up to 2021, and any CAs newer than that are in Debian 12 (Bookworm). And this, as I tooted at the time, is where things go awry:
After the upgrade to Bookworm, connections over ssh would drop immediately:
$ ssh imrannazar.com
Connection closed
$ ssh -v imrannazar.com
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
...
debug1: SSH2_MSG_KEXINIT sent
Connection closed
$
So what's going on here? Evidently something about the server's new version of ssh has broken incoming connections... Let's get into the VM's console to start a debug copy of ssh on a different port, and see what happens.
# /usr/sbin/sshd -D -d -o Port=9001
...
Missing privilege separation directory: /run/sshd
# mkdir /run/sshd
# /usr/sbin/sshd -D -d -o Port=9001

$ ssh -p9001 imrannazar.com
Connection closed
$
At least it's consistently crashing. Flipping back to the VM console, we get:
Aha, a lead! This error looks suspiciously like it would cause things to fall over in a heap:
Fatal glibc error: cannot get entropy for arc4random
Why would this suddenly become a problem in Debian 12, when it wasn't a thing before? As it turns out, arc4random is a method of generating random numbers that's been in OpenBSD for years, but only made it to glibc (and thus to OpenSSH on Linux) in July 2022. My previous Bullseye kernel was up to date as of Jan 2021, so it makes sense that my previously installed OpenSSH didn't have the support required.
Now, what is the ssh daemon doing to get this error? A useful way of finding out which system calls are being made by a given program is to run it through strace:

# strace -f /usr/sbin/sshd -D -d -o Port=9001
...
[pid 1402] poll([{fd=7, events=POLLIN}, {fd=8, events=POLLIN}], 2, -1 <unfinished ...>
[pid 1640] <... write resumed>) = 81
[pid 1640] getrandom(0x7f721688bd90, 16, 0) = -1 ENOSYS (Function not implemented)
[pid 1640] openat(AT_FDCWD, "/dev/urandom", O_RDONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
[pid 1640] writev(2, [{iov_base="Fatal glibc error: cannot get en"..., iov_len=53}], 1) = -1 EFBIG (File too large)
[pid 1640] --- SIGXFSZ {si_signo=SIGXFSZ, si_code=SI_USER, si_pid=1640, si_uid=104} ---
[pid 1402] <... poll resumed>) = 1 ([{fd=7, revents=POLLIN|POLLHUP}])
...
Here we see the ssh daemon start up as process 1402; a few hundred lines of strace's call log have been omitted here, as we're interested in what happens when it blocks waiting for connections. As we connect with an ssh client, the sshd wakes up process 1640 which was spawned as a handler.
The first thing the handler does is try to generate a random block of bytes to use for key exchange. According to the Phoronix page linked earlier:
The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy...
We see here that both the getrandom syscall and the urandom fallback failed. Without random data to start the key exchange, the handler crashes out. Now we know why we can't connect over ssh, what can we do?
A good overview of the history of random number generators in Linux is this LWN article by Jake Edge, which states that the getrandom syscall was added in Linux 3.17; when combined with this quote from the manual page for the corresponding C function, we're led to a particular conclusion:
Errors
ENOSYS: The glibc wrapper function for getrandom() determined that the underlying kernel does not implement this system call.
So the kernel doesn't support getrandom. But I just upgraded to Bookworm, Debian 12, and I specifically saw linux-image-6.1.0 get installed...

# uname -a
Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.41-2+deb7u2 x86_64 GNU/Linux
In what can only be seen as a testament to the stability of Linux's syscall API, I've been running the original Wheezy kernel from ten years ago, through all the upgrades. But I've rebooted more than once in the intervening time, so even though I haven't rebooted since the upgrade, I should be running the 5.10 kernel bundled in with Bullseye?
In a stroke of genius from the admins at DigitalOcean, it's possible to set your VM to use a bootloader which isn't the system's default grub installation. As it turns out, that's how my VM was set up:
I swapped it over to the recommended GrubLoader and rebooted, and ssh suddenly started working again.
And so we lose another day to the vagaries of amateur systems administration: after six hours, lots of cursing, and at least two instances of "wait, what" being uttered, I got my mailserver back to the state it was in that morning, and received that verification email I was waiting for.
Good thing Nov 9th wasn't a workday... Oh wait.
In this example, we can see some kind of conversation between Alice (the original poster) and Bob, with an interjection by Charlie; we also see two other comment threads. This is all derived from context, however, and there's no obvious structure to the threads. A nested presentation could perhaps be more conducive to understanding the flow of the conversation here:
As it turns out, we can generate a tree-style view given the parent-child links from the first example; in this article I'll take a look at how one can use Mastodon's Context API to gather and produce the necessary data.
If one were to open a particular toot on the above-mentioned thread (say, Charlie's interjection), a Mastodon client would fetch the toot's context: its direct ancestors up the tree, as well as any descendant threads down the tree. Presentationally, this might be shown as:
We see the opened toot emphasised, with Alice's reply below; importantly, we only get Bob's reply to the OP because it's in the part of the tree that's a direct ancestor. If a Mastodon client were to present the full conversation, it would need to repeat this process with the top of the tree by fetching the context for Alice's message: every toot in the conversation would then be a member of the context's descendants.
In Mastodon's JSON API, this is implemented by the context endpoint, for which an example partial return looks like:

$ curl https://mastodon.world/api/v1/statuses/111148878835275146/context | json_pp
{
   "ancestors" : [
      {
         ...
         "content" : "<p>This might have slipped under the radar these past few"...,
         "id" : "111148824053932160",
         "in_reply_to_id" : null,
         ...
      },
      {
         ...
         "content" : "<p><span class=\"h-card\" translate=\"no\"><a href=\"https://infose"...,
         "id" : "111148835195532183",
         "in_reply_to_id" : "111148824053932160",
         ...
      }
   ],
   "descendants" : [
      {
         ...
         "content" : "<p><span class=\"h-card\"><a href=\"https://mastodon.or"...,
         "id" : "111149026486499760",
         "in_reply_to_id" : "111148878835275146",
         ...
      }
   ]
}
Here we have a situation analogous to Charlie's conversation from earlier: the toot for which we requested context has a parent and grandparent, as well as a direct child. A few things should be noted:
The context endpoint returns ancestors and descendants of the given toot ID, but it doesn't return the content of that message; if we want the content of the toot in question, the separate statuses endpoint must be called.
Alongside other data (such as the account describing the author), each message has an id and an in_reply_to_id. The latter indicates the parent message for which this is a child.
The root message of the conversation is the one whose in_reply_to_id is null: it has no parent.

Having obtained the root node's ID from our context call, we can issue another context request to obtain parent-child links for the full conversation thread. This call will return no ancestors, and a batch of descendants:
$ curl https://mastodon.world/api/v1/statuses/111148824053932160/context | jq 'keys[] as $k | (.[$k] | (.[] | {id:.id, parent:.in_reply_to_id}))'
{ "id": "111148835195532183", "parent": "111148824053932160" }
{ "id": "111148878835275146", "parent": "111148835195532183" }
{ "id": "111149026486499760", "parent": "111148878835275146" }
{ "id": "111148886753997430", "parent": "111148835195532183" }
{ "id": "111148912289735390", "parent": "111148824053932160" }
{ "id": "111149031550141696", "parent": "111148824053932160" }
{ "id": "111149299225622339", "parent": "111148824053932160" }
{ "id": "111149308672212244", "parent": "111149299225622339" }
{ "id": "111149319647286908", "parent": "111149308672212244" }
{ "id": "111149323173850002", "parent": "111148824053932160" }
{ "id": "111149387367883525", "parent": "111148824053932160" }
{ "id": "111149724821007295", "parent": "111148824053932160" }
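For anyone without jq to hand, the same reduction can be sketched in JavaScript. The `context` object below is a trimmed stand-in for a real response (two IDs borrowed from the output above, with placeholder content), and `parentLinks` is a hypothetical helper name.

```javascript
// Trimmed-down stand-in for a context response: only the fields we use
const context = {
  ancestors: [],
  descendants: [
    {
      id: '111148835195532183',
      in_reply_to_id: '111148824053932160',
      content: '(placeholder)',
    },
    {
      id: '111148878835275146',
      in_reply_to_id: '111148835195532183',
      content: '(placeholder)',
    },
  ],
};

// Walk ancestors then descendants, keeping just the parent-child link
function parentLinks(ctx) {
  return [...ctx.ancestors, ...ctx.descendants].map((toot) => ({
    id: toot.id,
    parent: toot.in_reply_to_id,
  }));
}

console.log(parentLinks(context));
```

Reading ancestors before descendants mirrors the order in which jq's `keys` visits the two arrays, since it sorts the key names alphabetically.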
Once we have a list of parent-child relationships between nodes in the conversation, we can build a nested array of node IDs and their children: this will involve attaching a children array to each node of toot data, and filling the array with references to the direct children. In PHP, we could go about the production of the tree as follows:
$mastodon_inst = 'mastodon.world';
$root_id = '111148824053932160';

// First, we fetch the root toot...
$root = json_decode(file_get_contents(sprintf(
    'https://%s/api/v1/statuses/%s',
    $mastodon_inst,
    $root_id
)), true);

// Then context for the root
$ctx = json_decode(file_get_contents(sprintf(
    'https://%s/api/v1/statuses/%s/context',
    $mastodon_inst,
    $root_id
)), true);

// Initialise a map of toots by ID
$toots_by_id = [$root['id'] => ($root + ['children' => []])];

// There will only ever be descendants in the context
foreach ($ctx['descendants'] as &$child) {
    $child['children'] = [];
    $toots_by_id[$child['id']] = &$child;
}
unset($child); // Drop the dangling loop reference

// Finally, add each toot to the map as a child of its parent
foreach ($toots_by_id as &$subtoot) {
    if ($subtoot['in_reply_to_id']) {
        $toots_by_id[$subtoot['in_reply_to_id']]['children'][] = &$subtoot;
    }
}
unset($subtoot);
With the map filled in, a little light recursion is sufficient to generate a printed representation of the tree:
function print_toot($toot, $level = 0) {
    printf(
        "%s[%s] (%s)\n",
        str_repeat(' ', $level),
        $toot['id'],
        $toot['account']['acct']
    );
    foreach ($toot['children'] as $child) {
        print_toot($child, $level + 1);
    }
}

print_toot($toots_by_id[$root_id]);
[111148824053932160] (briankrebs@infosec.exchange)
 [111148835195532183] (QuatermassTools@infosec.exchange)
  [111148878835275146] (penguin42@mastodon.org.uk)
   [111149026486499760] (mkoek@mastodon.nl)
  [111148886753997430] (ozdreaming@infosec.exchange)
 [111148912289735390] (ePD5qRxX@mastodon.online)
 [111149031550141696] (dwaites@infosec.exchange)
 [111149299225622339] (CyberLeech@cyberplace.social)
  [111149308672212244] (neurovagrant@masto.deoan.org)
   [111149319647286908] (CyberLeech@cyberplace.social)
 [111149323173850002] (VZ@fosstodon.org)
 [111149387367883525] (systemadminihater@cyberplace.social)
 [111149724821007295] (j@jaesharp.social)
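The same map-then-link approach can be sketched in JavaScript, where object references do the job of PHP's `&` references. This is an illustration under assumptions rather than the code behind ThreadTree: the short IDs are invented, and `buildTree` and `formatTree` are hypothetical helper names.

```javascript
// Build a nested tree from a root ID and a flat list of {id, parent} links
function buildTree(rootId, links) {
  // First pass: one node per toot, keyed by ID
  const nodes = new Map([[rootId, { id: rootId, children: [] }]]);
  for (const { id } of links) {
    nodes.set(id, { id, children: [] });
  }
  // Second pass: attach each node to its parent, where the parent is known
  for (const { id, parent } of links) {
    const parentNode = nodes.get(parent);
    if (parentNode) parentNode.children.push(nodes.get(id));
  }
  return nodes.get(rootId);
}

// Flatten the tree into indented lines, two spaces per level of nesting
function formatTree(node, level = 0) {
  const line = ' '.repeat(level * 2) + node.id;
  return [line, ...node.children.flatMap((c) => formatTree(c, level + 1))];
}

// Made-up IDs standing in for toot IDs
const sampleLinks = [
  { id: 'b', parent: 'a' },
  { id: 'c', parent: 'b' },
  { id: 'd', parent: 'a' },
];

console.log(formatTree(buildTree('a', sampleLinks)).join('\n'));
```

Doing the work in two passes means a reply can appear in the list before its parent without breaking anything, since every node exists before any linking starts.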
An implementation much like this has been used on ThreadTree, the page I wrote when the original annoyance came up. The only additional note regarding the code behind ThreadTree is that in my case, the annoyance started on Twitter, so the code still refers to tweets in many places even though the site functionality has been ported to Mastodon; the principle of nested threading and contexts applies in much the same fashion.
Thanks go to Terence Eden's article on this very topic which helped with the building of ThreadTree, and mapgie who (re-)introduced me to Twitter and caused this whole mess.
Fifteen years later, this build was starting to really show its age: built in an era before mobile viewports were common, it would always render as a desktop site, and it had various vestiges of Internet Explorer compatibility, such as this excellent hack that was common at the time to allow for transparent PNG support.
#wrapper #foot {
  filter: progid:DXImageTransform.Microsoft.AlphaImageLoader(
    sizingMethod=crop,
    src='../img/foot.png'
  );
}
#wrapper > #foot {
  background: url(../img/foot.png) no-repeat top left;
}
When the idea of rebuilding this place came about towards the end of 2023, it seemed high time to make use of some of the more modern CSS features that have slowly crept into common usage and good browser support in the intervening span. Let's pick through the stylesheet and see which CSS techniques are being used that weren't at the time of the previous design.
At the top of the new stylesheet, we find:
:root {
  --g-bg: #e5dacc;
  --g-text: black;
  --g-fig: #d4c1aa;
  ...
}
body {
  background: var(--g-bg);
  color: var(--g-text);
}
Defining these variables on the :root pseudo-class makes them available throughout the stylesheet, meaning we can define colours or other values in one place. This makes switching out and testing of colour themes and palettes much easier; I tried a few palettes before settling on the current values, which was made simple by not having to hunt for usages of the particular colour values to swap them out.
It also makes colour usage more semantic: the page background isn't an arbitrary hex string, it's --g-bg and that's much more readable when scanning through the rest of the styling. This means that such features as, for example, switching the site to dark mode are made possible simply by switching out the definitions in this block.
Now there's an idea... Let me get back to that another time.
Immediately below the variable definition, we find a couple of webfont definitions for the header (Over the Rainbow) and code blocks (Inconsolata):
@font-face {
    font-family: 'OverTheRainbow';
    font-weight: normal;
    font-style: normal;
    src: url(/assets/overtherainbow.ttf) format('truetype');
}
@font-face {
    font-family: 'Inconsolata';
    font-weight: normal;
    font-style: normal;
    src: url(/assets/inconsolata.ttf) format('truetype');
}
With these fonts defined, their names can be used in any following rule. For example, code blocks have this styling:
main samp pre { font: 16pt Inconsolata, monospace; background: var(--g-fig); }
Webfont definitions support multiple file formats; for simplicity and highest compatibility, I've used the TTF files here.
Now, webfonts had been a thing for some time when the previous site design was put together, but the main thing preventing their use in the design at the time was...
A stylistic choice for the page header was to rotate the text slightly, but back in '08 this wasn't possible in a well-supported way. Modern browser support for the transform property is fairly complete, so this is finally useful. The syntax of the declaration itself is fairly simple:
h1 { ... text-align: center; transform: rotate(-3deg); }
This also means I don't have to implement the headers of each page as a transparent PNG laid over the header graphic, which significantly reduces page weight and helps not only with load speed, but with accessibility: having the page's main header be an image can be confusing when passed through a screenreader or other accessibility tools, but as text the header's existence is made clear.
The main disadvantage of webfonts is that they need to be loaded in by the browser, and that can only occur after the CSS has been downloaded:
For the period between the CSS being available, and the webfonts loading in, the default behaviour is to blank out the element being rendered for lack of a font to use. You can, however, specify a fallback behaviour which is to use the second font defined against the element (and if that's also a webfont, fall back further until the browser reaches a standard font that's on the machine).
Having these elements visible (in the wrong font) before their fonts load in may seem weird from a design perspective, but it's good for accessibility and page responsiveness: if you can see all the text on the page immediately, a screenreader (or a search engine's bot, for that matter) can too, and it won't have to wait for everything to load.
Fallback behaviour can be configured on the @font-face rule:
@font-face {
    font-family: 'OverTheRainbow';
    font-weight: normal;
    font-style: normal;
    src: url(/assets/overtherainbow.ttf) format('truetype');
    font-display: fallback;
}
h1 {
    ...
    font-family: OverTheRainbow, sans-serif;
}
There's one more piece of the puzzle regarding the behaviour of the page heading on this site, and it's how the header responds to resizing and different sizes of viewport. For that, we come to perhaps the best new addition to CSS of all:
With clamp, one can specify a flexible measurement together with lower and upper bounds that the measurement won't cross. For example, the h1 on this page is defined in full as below:
h1 { font: clamp(24px, 3.5vw, 60px)/1.4 OverTheRainbow, sans-serif; }
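Let's break this down:
- The preferred font size is 3.5vw, or 3.5% of the viewport width;
- The size is never allowed to go below 24px, or above 60px;
- The line height is 1.4 times the font size;
- The text is set in the OverTheRainbow webfont, with a fallback to the browser's built-in sans-serif.
And if you're thinking these figures look a little arbitrary, that's because they are: some experimentation was needed to arrive at values which work sensibly with the design.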
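As a rough illustration of how that clamp() resolves, here's a small Python sketch of the same arithmetic. The function names are made up for the example, and it assumes 1vw is 1% of the viewport width in CSS pixels; it's a model of the calculation, not of how any browser implements it.

```python
def clamp(minimum, preferred, maximum):
    """Mirror CSS clamp(MIN, VAL, MAX), which resolves to max(MIN, min(VAL, MAX))."""
    return max(minimum, min(preferred, maximum))


def h1_font_size(viewport_width_px):
    """Font size in px for clamp(24px, 3.5vw, 60px) at a given viewport width.

    Hypothetical helper: 3.5vw is taken as 3.5% of the viewport width.
    """
    preferred = 3.5 * viewport_width_px / 100
    return clamp(24.0, preferred, 60.0)


if __name__ == "__main__":
    # A narrow, a mid-sized and a very wide viewport.
    for width in (400, 1000, 2000):
        print(width, h1_font_size(width))
```

A 400px viewport pins the heading at the 24px floor and a 2000px viewport at the 60px ceiling; in between, the size tracks the viewport width.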
Clamping the font size handles responsiveness of the heading text, but the other component that resizes is the scrap-of-paper design, which is an element background. For this, we need...
In the way back when, you could define image backgrounds on elements in CSS, but your options were to have them repeat horizontally/vertically or ...not. Modern CSS gives you two new tools as options to background-size: there's cover, which tries to completely cover the element by sizing up the background and cropping the edges, and there's contain, which tries to fit the image into the element in full, perhaps leaving uncovered gaps in either the vertical or the horizontal.
For our use-case, we need contain because we want the whole image visible:
header {
    background: url(/assets/headback.webp) no-repeat top left;
    background-size: contain;
    width: 100%;
    min-width: 900px;
    max-width: 1450px;
    aspect-ratio: 145/35;
    max-height: 350px;
}
With this set of rules, we're giving the browser significant constraints in how it's to render the header: full-width, but within a particular range, and always maintaining a 145/35 aspect ratio. In addition, when the browser comes to fill in the background, contain means it will lean towards keeping the full image visible rather than trying to crop any edges.
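To see what those constraints work out to, the sizing can be modelled in a few lines of Python. This is only a sketch: header_box is a hypothetical helper, it assumes the header's container is as wide as the viewport, and it ignores padding, borders and anything else a real layout engine would take into account.

```python
def header_box(container_width_px):
    """Approximate the header's rendered size under the rules above.

    width: 100%, held between min-width 900px and max-width 1450px;
    height: derived from the 145/35 aspect ratio, capped at max-height 350px.
    """
    width = max(900.0, min(float(container_width_px), 1450.0))
    height = min(width * 35 / 145, 350.0)
    return width, height


if __name__ == "__main__":
    for container in (500, 1200, 2000):
        print(container, header_box(container))
```

At the 1450px maximum width the aspect ratio gives a height of exactly 350px, so in this model the max-height rule agrees with the width limit rather than fighting it.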
We've made it as far as defining the header, and already modern CSS has come in very useful for creating a design that was simply not possible in the dark days of the late oughts. As we come down the page, there are more CSS features in use that we won't cover in detail here:
And there are features coming to wider support in CSS that haven't yet been employed here, the most compelling of which is rule nesting. To take a sample from further down the stylesheet:
main samp {
    font: 16pt Inconsolata, monospace;
    background: var(--g-fig);
    display: block;
}
main samp pre {
    margin: 0;
    padding: 1em;
    overflow: auto;
}
main samp kbd {
    color: var(--g-code-keyword);
}
main samp var {
    color: var(--g-code-var);
}
main samp s {
    color: var(--g-code-comment);
    text-decoration: none;
}
There's a bunch of redundancy in the match specifications here, and one of the more compelling reasons to use CSS pre-processors like Sass has, in the past, been that one can nest rules like this to make for cleaner code to work with:
main samp {
    font: 16pt Inconsolata, monospace;
    background: var(--g-fig);
    display: block;
    pre {
        margin: 0;
        padding: 1em;
        overflow: auto;
    }
    kbd {
        color: var(--g-code-keyword);
    }
    var {
        color: var(--g-code-var);
    }
    s {
        color: var(--g-code-comment);
        text-decoration: none;
    }
}
A pre-processor means a build step, which means a build process, and suddenly your plain HTML/CSS site has become a Whole Thing. With CSS nesting coming to browsers, this will soon be natively supported client-side, and will no longer be a reason to have a build process at all.
Overall, I'm happy with how the refresh has come out; here's to another fifteen years. Let's see how CSS advances in the interim, and what's possible (and widely supported) the next time I get bored and decide to rebuild this place.
Explorations of the smallest possible thing in a particular area aren't unprecedented here: the first article on this site is The Smallest Nintendo DS ROM, from 2006. So when I heard about the Binary Golf Grand Prix I was left with no choice but to put in an attempt. Unfortunately, I learned of the existence of BGGP #4 around two hours before it closed for submissions, so I ended up handing in a scrap of PHP. But let's look at what could've been...
This year's problem is specified as follows:
Create the smallest self-replicating file.
A valid submission will:
- Produce exactly 1 copy of itself
- Name the copy "4"
- Not execute the copied file
- Print, return, or display the number 4
The most natural thought that arises, unbidden, is how one would go about this on the Commodore 64. Commodore BASIC has the built-in SAVE command, so our submission might be as simple as:
10 SAVE "4", 8
20 PRINT 4
As the task here is to generate the smallest file, we should look at how this program is stored in memory (and thus, on disk). Commodore BASIC stores programs almost entirely as written, and the code execution process includes a parsing step which tokenises the program before it's executed. So the above program looks like this if we use VICE's memory monitor:
(C:$e5cf) m 0800
>C:0800  00 0e 08 0a  00 94 20 22  34 22 2c 20  38 00 16 08   ...... "4", 8...
>C:0810  14 00 99 20  34 00 00 00  00 00 ff ff  ff ff 00 00   ... 4.....????..
We can see the program starts at address $0801 hexadecimal: this is the case for every BASIC program on the C64, whether typed in or loaded. Each line starts with a pointer to the next line, allowing the BASIC interpreter to rapidly work through the program if it needs to determine the program length or find a particular line to change its content.
After the pointer to the next line comes the line number: two bytes, in little-endian format (low byte first), which is standard for the 6502-derived processor inside the C64. So we see the first line is numbered $000a, or 10 in decimal. And after this comes the line content itself: 94 20 22 34 22 2c 20 38 00.
As a program is typed into the BASIC interpreter, it's tokenised: any keywords in the line get replaced by token values before being stored in memory. We can see in this line that SAVE has been replaced by command token $94, such that if this value comes at the start of a command the interpreter knows this is a SAVE command without needing to read and parse the individual letters of the word "SAVE".
The rest of the command is not tokenised, and responsibility for parsing this arguments string is passed to whichever routine in the BASIC interpreter is handling the command. In this case, the rest of the command is ASCII text: <space>"4",<space>8; note that the spaces are preserved as they were entered in the program. The arguments are terminated by a zero byte.
The second line is similarly structured: a $99 token representing the PRINT command, and an arguments string of 20 34, representing <space>4.
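The layout described above (next-line pointer, line number, tokenised content, zero terminator) can be walked with a short Python sketch. The bytes are copied from the memory dump, starting at $0801; list_program and the two-entry token table are made up for this example, and the sketch expands tokens everywhere, without the special handling of quoted strings that a real interpreter would need.

```python
# Only the two tokens seen in the dump; a real table has many more entries.
TOKENS = {0x94: "SAVE", 0x99: "PRINT"}
LOAD_ADDRESS = 0x0801  # start of BASIC program memory on the C64

# Bytes from $0801 onwards, as shown by the monitor.
PROGRAM = bytes.fromhex(
    "0e08" "0a00" "94202234222c2038" "00"   # line 10
    "1608" "1400" "992034" "00"             # line 20
    "0000"                                  # end of program
)


def list_program(image, load_address=LOAD_ADDRESS):
    """Follow the chain of next-line pointers and detokenise each line."""
    lines = []
    offset = 0
    while True:
        next_line = int.from_bytes(image[offset:offset + 2], "little")
        if next_line == 0:  # a null pointer marks the end of the program
            break
        number = int.from_bytes(image[offset + 2:offset + 4], "little")
        end = image.index(0, offset + 4)  # zero byte terminates the line
        text = "".join(TOKENS.get(b, chr(b)) for b in image[offset + 4:end])
        lines.append((number, text))
        offset = next_line - load_address
    return lines


if __name__ == "__main__":
    for number, text in list_program(PROGRAM):
        print(number, text)
```

Running it recovers the two lines as typed, spaces included, which is a quick check that the pointer and line-number fields have been read the right way round.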
Now we've seen how the BASIC program is stored internally, we have direction for a round of golf. The first step is to remove the spaces:
10SAVE"4",8
20PRINT4
One might be tempted to remove the quotes around the filename, relying on whatever type coercion may exist in the BASIC interpreter to convert 4 into a string. Unfortunately, this isn't a thing on the C64:
10SAVE4,8
20PRINT4
RUN
?TYPE MISMATCH ERROR IN 10
Commodore BASIC does, however, support multiple commands on one line, with the colon separator. Thus, we can combine the two commands of our program and remove the four bytes associated with the second line pointer and line number:
10SAVE"4",8:PRINT4
This program is stored in memory as below:
(C:$e5cf) m 0800
>C:0800  00 0f 08 0a  00 94 22 34  22 2c 38 3a  99 34 00 00   ......"4",8:.4..
We see that the two commands have been stored as one line, so there's one next-line pointer (which refers to the byte after the end of the program); it also becomes plain that a zero byte isn't the only thing that can end an arguments string in C64 BASIC. The colon, byte $3a, can also act as an end-of-arguments delimiter, denoting the end of a command and the start of another.
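The stored form of this one-line program can also be rebuilt by hand, which makes the byte count easy to check. The sketch below is Python rather than anything that runs on the C64; encode_line is a hypothetical helper, and the tokens are the two values seen in the dumps above.

```python
import struct

SAVE_TOKEN = 0x94
PRINT_TOKEN = 0x99
BASIC_START = 0x0801  # where the first line of a BASIC program lives


def encode_line(number, body, address=BASIC_START):
    """Build one stored BASIC line: next-line pointer, line number, body, zero byte."""
    next_line = address + 2 + 2 + len(body) + 1
    return struct.pack("<HH", next_line, number) + body + b"\x00"


# 10SAVE"4",8:PRINT4 with both keywords replaced by their tokens
BODY = bytes([SAVE_TOKEN]) + b'"4",8:' + bytes([PRINT_TOKEN]) + b"4"
LINE = encode_line(10, BODY)

if __name__ == "__main__":
    print(len(LINE), LINE.hex(" "))
```

The encoded line matches the bytes in the monitor output from $0801 up to the terminating zero, with the next-line pointer landing on $080f.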
Our final result in Commodore BASIC is 14 bytes. Is it possible to produce a smaller program by switching to machine language, and communicating with the disk drive through Commodore DOS directly?
One advantage of using such a venerable computer as the C64 for this task is that the documentation is extensive and copious. There are eleven separate commentaries on the SAVE routine available in the C64 Kernal API reference, including detailed explanation of the internal workings; for our purposes, we'll use the example routine provided by Commodore's own Programmer's Reference, which gives a sequence of operations for using SAVE:
- Use the SETLFS routine and the SETNAM routine (unless a SAVE with no file name is desired on "a save to the tape recorder"),
- Load two consecutive locations on page 0 with a pointer to the start of your save (in standard 6502 low byte first, high byte next format).
- Load the accumulator with the single byte page zero offset to the pointer.
- Load the X and Y registers with the low byte and high byte respectively of the location of the end of the save.
- Call this routine.
In this case, we'll be saving to disk (device 8), so we'll want to call both SETLFS and SETNAM before running through the save. There are two things we'll need to note for the save itself:
As mentioned above, when a BASIC program is loaded from disk, it always loads to the same location (the start of BASIC memory, $0801 hexadecimal); a machine-language program doesn't have this restriction, and can be loaded into memory anywhere. To facilitate this, the first two bytes of the program as stored on disk are actually the address to which it should be loaded. Accordingly, when saving the program this must be accounted for by setting up the load address in memory beforehand (in little-endian format, as the Programmer's Reference mentions above).
We also have the extra stipulation that this program needs to be self-replicating: if the file saved by this code is reloaded, it will need to behave the same way. To that end, the load address for the program will need to be written by the program itself, to a location just before the program, and the concatenated block saved together.
BASIC keeps track of how long the entered program is, so it can determine how much memory needs to be saved to disk; our machine-language program will have to set up the amount of data to save without this help being available.
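The on-disk layout this implies, a two-byte little-endian load address followed by the program bytes, is simple enough to sketch in Python. build_prg and split_prg are hypothetical helpers for illustration; the payload used here is just the first instruction of the routine that follows (lda #1, which assembles to a9 01).

```python
import struct


def build_prg(load_address, payload):
    """Prefix a block of program bytes with its little-endian load address."""
    return struct.pack("<H", load_address) + payload


def split_prg(prg):
    """Separate a saved file back into its load address and program bytes."""
    (load_address,) = struct.unpack_from("<H", prg)
    return load_address, prg[2:]


if __name__ == "__main__":
    image = build_prg(0xC0C0, b"\xa9\x01")
    print(image.hex(" "))
    print(split_prg(image))
```

For a load address of $C0C0 both header bytes happen to be the same value, which is why the routine below can get away with writing $c0 to the two header locations.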
At this point, it's already becoming obvious that our machine-language attempt will be longer than 14 bytes to perform the same task as the BASIC program above, but to quote from a certain classic game show:
I've started, so I'll finish...
FILENAME  = $002A       ; An unused byte in zero page
PROGPTR   = $009B       ; An unused block of two bytes
PROGSTART = $C0C0       ; Program runs from here
MEMSTART  = $C0BE       ; SAVE starts from here

    processor 6502
    org PROGSTART

start:
    lda #1              ; Logical file 1
    ldx #8              ; Disk drive 0 (device 8)
    ldy #255            ; No secondary command
    jsr $FFBA           ; SETLFS
    lda #'4             ; Our filename: "4"
    sta FILENAME        ; Store the name
    lda #1              ; It's one byte long
    ldx #<FILENAME      ; And it's stored in zero page
    ldy #>FILENAME      ; at this address
    jsr $FFBD           ; SETNAM
    lda #$c0            ; Our load location ($C0C0)
    sta MEMSTART        ; Written to the two bytes
    sta MEMSTART + 1    ; before the program in memory
    lda #<MEMSTART      ; Set up our indirect pointer
    sta PROGPTR         ; from the start of the file
    lda #>MEMSTART      ; to our two-byte pointer store
    sta PROGPTR + 1     ; in zero page
    lda #<PROGPTR       ; Pointer to the start of file
    ldx #<end           ; Actual location of the end of file
    ldy #>end           ; (including the two-byte header)
    jsr $FFD8           ; SAVE
    lda #4              ; We need to print or return 4
    rts                 ; So let's return 4
end:
The above machine-language program clocks in at 50 bytes, almost four times as large as the equivalent BASIC routine. There's some scope for reducing that number, but we don't have a hope of reaching the 14 bytes that a higher-level representation affords us. It should also be noted that we're not attempting to print the number 4 as per the task description, as this would require another call out to the Kernal and/or recycling of BASIC routines to do the same.
There is a level below this, of course, which we haven't reached today: direct communication with the disk drive to push the program to disk, bypassing Commodore DOS (at least at the computer end). The Kernal API is an abstraction on top of this, meaning we don't have to worry about the disk drive's serial bus and other vagaries of the implementation.
This little game of code golf has allowed us a look into why higher-level languages exist at all: not only do they allow us to perform tasks such as "save a file to disk" with less code, they also abstract away complications such as setting the filename before writing a file. Perhaps BASIC shouldn't be maligned quite so much.
Thanks go to the Online 6502 Disassembler by Norbert Landsteiner, and the dASM assembler whose name always makes me think it's a disassembler.
The Commodore 64 has, as the name implies, 64kB of static RAM that can be written to at any time; however, its processor only has access to a 64kB address space, and somehow needs to fit the BASIC and Kernal DOS ROMs, as well as peripheral access, into that same space. Commodore achieved this by overloading the processor's onboard I/O port so you can "switch in" the ROMs and peripheral space, and switch them out if you need access to all the RAM.
For example, the Kernal ROM maps into memory at $E000, but only if it's been enabled at the CPU port level by switching on bit 1 (HIRAM). In the emulator's memory controller, this particular mapping might read as follows in JavaScript.
const CPUPORT = 0x0001;
const HIRAM = 2;

export const readByte = (addr) => {
    switch (addr & 0xF000) {
        // RAM, BASIC, peripheral areas...

        // Kernal ROM
        case 0xE000:
        case 0xF000:
            if (memory[CPUPORT] & HIRAM) {
                return ROM.kernal[addr & 0x1FFF];
            } else {
                return memory[addr];
            }
            break;
    }
};
If we're to translate this to WAT, we'll first convert the switch statement to a function table:
(memory (import "mem") 2)
(table 16 anyfunc)
(elem (i32.const 0)
    ;; RAM, BASIC, peripheral areas
    ;; ...
    $read_kernal $read_kernal
)
(type $readfunc (func (param i32) (result i32)))
(func $read (param $addr i32) (result i32)
    ;; table[(addr & 0xF000) >> 12]()
    (call_indirect (type $readfunc)
        (get_local $addr)
        (i32.shr_u
            (i32.and (get_local $addr) (i32.const 0xF000))
            (i32.const 12)
        )
    )
)
Now, this isn't terrible so far. We have two contiguous 64k-value blocks of memory (one for RAM, one for the mapped ROMs), and there are some magic constants in the $read main handler, but they make sense in the context of needing to extract four bits from the address and using those to index the function table. Where the constants start to make less sense is in $read_kernal:
(func $read_kernal (param $addr i32) (result i32)
    (i32.load8_u
        (if (result i32)
            (i32.and (i32.load (i32.const 0x0001)) (i32.const 2))
            (then
                (i32.add
                    (i32.and (get_local $addr) (i32.const 0xFFFF))
                    (i32.const 0x10000)
                )
            )
            (else
                (i32.and (get_local $addr) (i32.const 0xFFFF))
            )
        )
    )
)
This is more inscrutable, especially if (as was the case for me) this code has been written and then left to marinate for a year before coming back to attempt to read it again. It would be much more readable if the above function could instead use defined constants:
(define CPUPORT (i32.const 0x0001))
(define HIRAM (i32.const 2))
(define ROM_MEMORY_START (i32.const 0x10000))
(func $read_kernal (param $addr i32) (result i32)
    (i32.load8_u
        (if (result i32)
            (i32.and (i32.load CPUPORT) HIRAM)
            (then
                (i32.add
                    (i32.and (get_local $addr) (i32.const 0xFFFF))
                    ROM_MEMORY_START
                )
            )
            (else
                (i32.and (get_local $addr) (i32.const 0xFFFF))
            )
        )
    )
)
The above code makes use of a define keyword that doesn't exist in WASM, so we need something akin to the C preprocessor to parse through the WAT file picking up definitions and replacing their occurrences. Fortunately, WAT was designed to be a format that's quick and easy to handle programmatically (while still being at least halfway usable by human standards): a WebAssembly Text file is one big Lisp-style S-expression, and each element within the file is itself a nested S-expression.
This means we can use S-expression handling libraries to quickly move from the WAT file to an internal representation that can be worked with. One such library is sexpdata by Joshua Boyd, which is an S-expression parser and dumper for Python. If we point our fledgling MMU file at sexpdata.load, something usable starts to fall out:
import sys
from sexpdata import load

def main():
    filename = sys.argv[1]
    print(load(open(filename)))

if __name__ == "__main__":
    main()
[Symbol('module'), [Symbol('define'), Symbol('CPUPORT'), [Symbol('i32.const'), Symbol('0x0001')]], [Symbol('define'), ...
What falls out is a set of arrays, holding Symbol objects (which are an internal construct to sexpdata) and other arrays. We can work with this by using a recursive function to handle the array representing the whole file: if an array is found inside, it will need to be processed recursively, unless it's an array that holds a define statement.
Definitions can be made at any level of the WAT file, so any extracted definitions will need to be passed down to deeper levels of recursion, and merged with any definitions that are passed in from higher levels. So that means the list of definitions will need to be a parameter to the preprocessor:
import sys
from sexpdata import load, dumps

def process(sexp, defines):
    new_defines = {}

    # First pass: Find immediate descendent s-exp's which define macros
    # TODO

    # Macros defined at this level override any of the same name higher up
    # NOTE: We don't want to mutate the parent's dict, we want our own copy
    merged_defines = defines.copy() | new_defines

    # Second pass: Replace at this level, recurse if deeper levels encountered
    # TODO
    return [i for i in sexp]

def main():
    filename = sys.argv[1]

    # Top-level process, with no existing definitions in the dict
    print(dumps(process(load(open(filename)), {})))

if __name__ == "__main__":
    main()
We come to the crux of the problem: filling in the TODOs above. In the first pass, this is fairly simple: we'd like to find any arrays where the first element is a Symbol('define'), and pull out the associated definition. Let's take the first line of debug output again.
[Symbol('module'), [Symbol('define'), Symbol('CPUPORT'), [Symbol('i32.const'), Symbol('0x0001')]], [Symbol('define'), Symbol('HIRAM'), [Symbol('i32.const'), 2]], ...
We see that some elements in the file have been parsed as scalar values, and some as Symbols. As Python doesn't have a native is-array, we can't directly find arrays in order to perform the first-element check mentioned above; we can, however, use numpy's isscalar to detect scalar values, and isinstance for Symbols. From here, it's fairly simple to detect and extract definition clauses.
from numpy import isscalar
from sexpdata import Symbol

def is_scalar_or_symbol(i):
    return isscalar(i) or isinstance(i, Symbol)

def process(sexp, defines):
    new_defines = {}
    for i in sexp:
        if not is_scalar_or_symbol(i):
            if i[0] == Symbol('define'):
                new_defines[i[1]] = i[2]
    ...
And once we have the definitions to be used on this level, replacement is fairly simple. There are four types of item that will need action:
- A scalar or symbol which matches a definition: Replace;
- A scalar or symbol which doesn't match a definition: Copy;
- An array whose first element is Symbol('define'): Exclude;
- An array whose first element is not Symbol('define'): Recurse.
    return [
        merged_defines[i] if (
            is_scalar_or_symbol(i) and merged_defines.get(i)
        )                                       # Replace
        else i if is_scalar_or_symbol(i)        # Copy
        else process(i, merged_defines)         # Recurse
        for i in sexp
        if is_scalar_or_symbol(i) or i[0] != Symbol('define')  # Exclude
    ]
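To see both passes working together without needing sexpdata installed, here's a self-contained sketch of the same idea. It assumes the file has already been parsed into nested Python lists, with plain strings standing in for sexpdata's Symbol objects; it follows the two passes described above, but it is not the script from the Gist.

```python
def process(sexp, defines):
    """Strip (define NAME VALUE) forms from a nested list and substitute NAME."""
    # First pass: collect definitions made directly at this level.
    new_defines = {
        item[1]: item[2]
        for item in sexp
        if isinstance(item, list) and item and item[0] == "define"
    }
    # Definitions at this level override inherited ones; the parent's dict
    # is left untouched.
    merged = {**defines, **new_defines}

    # Second pass: replace, copy, recurse or exclude.
    output = []
    for item in sexp:
        if isinstance(item, list):
            if item and item[0] == "define":
                continue                            # Exclude
            output.append(process(item, merged))    # Recurse
        elif isinstance(item, str) and item in merged:
            output.append(merged[item])             # Replace
        else:
            output.append(item)                     # Copy
    return output


if __name__ == "__main__":
    module = [
        "module",
        ["define", "HIRAM", ["i32.const", 2]],
        ["func", "$f", ["i32.and", ["get_local", "$x"], "HIRAM"]],
    ]
    print(process(module, {}))
```

Note that a replacement value is inserted as-is and isn't itself scanned for further definitions, so the result of one define referring to another depends on where each is defined.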
Putting everything together, the following GitHub Gist contains the final preprocessor script, as well as a sample input WAT file and the generated output:
https://gist.github.com/Two9A/427985064d360342caaf4f7d5769aeef
The astute observer will already have noticed that circular definitions are not handled at all well by this script: if one define contains a keyword that's handled by another define, the final output is dependent on the order of definition. In addition, this replicates the behaviour of C's #define but it doesn't help with any of the other C-style preprocessor directives, notably #include; support for these is a matter for future expansion.
The initial probe had been dubbed Hyper One, the first successful test of a hyperspace tunneling engine: launched on a Paludis III twelve years ago, it had made its ponderous way to the Lagrange point sixty degrees ahead of the Earth in its orbit, taking a few months to get to its testing position. When the drive was spun up, it flicked across to somewhere in the vicinity of L5, sixty degrees behind Earth, in half a second. The tunneling engine was a miracle of high-speed space travel, but Hyper One and two subsequent probes had only carried bacterial or other small samples.
Hyper Four would be the first test of a hyperspace tunneling engine with humans on board. The engine had been retrofitted to one of SpaceX's old service capsules: with their Mars cycler fully up and running as of a few years ago, their stocks of Dragon capsules were simply taking up space and any use for them was encouraged. Life support and environmental controls were still working perfectly in this particular capsule: in its previous job as fifth (and seventh) crew service mission to the Space Station, it had encountered no issues except for a harder-than-usual ocean landing on the second use. That made it a fairly cheap capsule for the Hyper team to pick up.
James Kent, veteran of the Hyper series, had been voluntold as the crew member for this fourth mission: Dragon had made its way out under its own steam, and was now parked in orbit around the Earth-Moon system's L2 point: that helped to simplify the calculations by removing nearby gravitational influences, while still being visible from Earth so the Hyper team could track Four from the ground.
James thought back to the first test of Hyper One, and the astonishment they'd all felt when One had jumped more than 0.2 AU in the blink of an eye. And now he'd be doing the same, looking out the portholes of a capsule at the space below (or was it above?) realspace. To say he was excited was probably understating it.
The radio crackled into life.
"Hyper Four, this is Houston again. Our board is green for spin-up; we're sending you updated hyper-coordinates for the tunneling engine. Please confirm."
"Programmed in," James stated. "Let's do this, over."
"Copy, Hyper Four. Engage when ready."
James flipped a switch over his head, and the carrier signal from Earth immediately redshifted into oblivion. Light flooded in through the capsule's portholes.
Nothing else seemed to happen for a few seconds. Time passed on board Hyper Four, the clock by James's hand ticked by at its normal rate, but the light from outside remained unchanging and constant. Then the radio crackled into life again.
License and registration, please.
"Er, Houston... do you read?"
James didn't see how he could be receiving radio from above (or was it below?) the skein of realspace, since the light waves wouldn't make it outside the constraints of untunneled space...
Come on, you can't pop up in the middle of the B6631 and cause disruption to traffic, then plead ignorance. You've been pulled over to the shoulder; your license, please.
"Alright, if this is Guinea base, Funny Joke, guys. I'll have to ask you how you're tunneling radio into the hyperskein when I get back; over."
We're talking at cross-purposes here. Do you have a license, sir?
"I guess I'll play along. No, I guess?"
Oh, you're one of those. Right, well, you may regard your patch of space as sovereign, but if you make use of the federal galactic highways, you'll need to abide by federal law. As you don't have your license, I'm issuing you a summons to Sirius district court: you'll need to appear in-
"Wait, wait, hold on. What do you mean, federal highways? We only just connected to hyperspace, I didn't know that-"
Oh, you're one of those. Right, well. Er, let's see... Firstly, it's infraspace, the interbrane region below realspace, that you're in.
"Clears that one up, I guess..."
Secondly, your tunnel intersected the B6631 galactic highway and has left a hole in lane four. And not a smooth hole, either; your engine is terribly noisy in its hookup to the interbrane.
"Er, right. I wasn't aware that the B6631 ...ran through here. It's not like we have a map..."
Right, yeah, first-contacters, sure. I could've sworn there was a protocol for this... In light of your status, I've rescinded the summons to Sirius-f. Let me go talk to someone, I'll be back.
James was left alone for a few minutes. Or at least, a few minutes passed on board the capsule; who knows what was happening in realspace. This... highways officer? that James had talked with, showed up on the radio again.
Alright, you're free to go. As I mentioned, you've been pulled over to the shoulder of the B6631, and my superiors have authorized me to write a map of the highway network to your support equipment's silicate substrate. Your nearest on-ramp is the core of the planet you refer to as Jupiter; please try not to drill any more holes in our roads.
"Well, thank you. This is... amazing. I just have a couple of questi-"
"How do you speak my language" comes up surprisingly often with first-contacters, my superiors mentioned. Consider that your vehicle is built in realspace, and you're in the interbrane at the present time: all atomic connections are visible to those who can travel directly in the interbrane.
"Right, yes... I guess my only other question is, how do we get to the core of Jupiter to join the highway?"
Surface roads aren't our concern, sir. Your local onramp was constructed some time ago.
"Ok, well, er. Thank you..."
One more thing before I eject you into realspace, sir. I've also written our current emissions regulations into your vehicle's silicate; a vehicle with such noisy tunneling output as yours runs the risk of being impounded, and that's a court appearance that can't be rescinded. Have a good journey.
And the light pouring into the capsule flicked off. The capsule's console screen started spewing text:
Determining position
12 degrees above the orbital plane; reorienting comms
Carrier signal obtained
Time elapsed since signal loss: 0.48 seconds
"Er, Houston, Hyper Four. Boy, do I have a tale for you."
dmsetup table --showkeys, and the LUKS header with key material from cryptsetup luksHeaderBackup.)
Any sane person would write the data off as lost, but I decided to hold out a sliver of hope that it was only one disk that had gone bad, and that I'd just failed to arrange things properly. I didn't want to futz with the disks any further, so any more work on them would have to wait until a disk large enough to hold all the images was available.
Fast forward to May 2015, and the release of that crazy SMR 8TB from Seagate. I ran out and pre-ordered one, seeing my chance, then set to dd'ing the RAID5 member partitions over to the 8TB disk. (Yes, I should've used ddrescue, but I got lucky, and the disks were physically fine.) Then I wrote a script to run over the images of disks 1, 2 and 3 (in the order I'd left them in), trying four different RAID5 layouts, four different chunk sizes and 24 permutations of disk order. It should be noted that I deliberately assembled the RAID from 3 out of 4 disks, to prevent a rebuild overwriting anything.
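The size of that search space is easy to enumerate. The sketch below is not the original script: candidate_names is a hypothetical helper, the labels follow the pattern of the log lines further down (for example luks.g431.ls.64, where "g" marks the missing disk), and the "rs" layout code is an assumption, since only la, ls and ra appear in the log excerpt.

```python
from itertools import permutations, product

LAYOUTS = ("la", "ls", "ra", "rs")  # "rs" is assumed, it isn't visible in the log
CHUNK_SIZES = (64, 128, 256, 512)   # as seen in the log file names


def candidate_names(disks):
    """Label every ordering/layout/chunk-size combination for three disks plus a gap."""
    slots = ("g",) + tuple(str(d) for d in disks)
    return [
        f"luks.{''.join(order)}.{layout}.{chunk}"
        for order, layout, chunk in product(permutations(slots), LAYOUTS, CHUNK_SIZES)
    ]


if __name__ == "__main__":
    names = candidate_names((1, 3, 4))
    print(len(names))                 # 24 orderings x 4 layouts x 4 chunk sizes
    print("luks.g431.ls.64" in names)
```

Each candidate then gets tried against every passphrase variation, which is where the attempt count multiplies up again.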
That script ran on disk combinations 1/2/3, 1/3/4 and 2/3/4, and generated 384 LUKS header backups. Then I ran cryptsetup luksAddKey against each of those backups, using a file containing 24 possible variations on the passphrase I'd used to set up the encryption. So that's 9,216 attempts, most of which came back with "No key available with this passphrase". But one attempt out of all of those looked different:
luks.g413.ls.512: No key available with this passphrase.
luks.g413.ls.64: No key available with this passphrase.
luks.g431.la.128: No key available with this passphrase.
luks.g431.la.256: No key available with this passphrase.
luks.g431.la.512: No key available with this passphrase.
luks.g431.la.64: No key available with this passphrase.
luks.g431.ls.128: No key available with this passphrase.
luks.g431.ls.256: No key available with this passphrase.
luks.g431.ls.512: No key available with this passphrase.
luks.g431.ls.64:
Trying passphrase: In a hole in the ground, there lived a hobbit!
luks.1g23.ra.128: No key available with this passphrase.
luks.1g23.ra.256: No key available with this passphrase.
luks.1g23.ra.512: No key available with this passphrase.
luks.1g23.ra.64: No key available with this passphrase.
The degraded RAID set that didn't contain disk 2 was the right set. It turns out that the order I left the disks in all those years ago was 2/4/3/1, and the layout and chunk size were the defaults. Amazingly, the only thing that had been corrupted was the LUKS header on disk "2", and all the encrypted blocks were fine.
Earlier today, I bought a second 8TB disk to copy all the data off; the first 8TB will be repurposed as its mirror. Ten minutes ago, I finished the song that was so rudely aborted by the Mayan apocalypse.
Lesson learned: if you're going to encrypt your disks, keep a backup of the master key somewhere.
Independent Submission
Request for Comments: 7168
Updates: 2324
Category: Informational
ISSN: 2070-1721
The Hyper Text Coffee Pot Control Protocol (HTCPCP) specification does not allow for the brewing of tea, in all its variety and complexity. This paper outlines an extension to HTCPCP to allow for pots to provide networked tea-brewing facilities.
This document is not an Internet Standards Track specification; it is published for informational purposes.
This is a contribution to the RFC Series, independently of any other RFC stream. The RFC Editor has chosen to publish this document at its discretion and makes no statement about its value for implementation or deployment. Documents approved for publication by the RFC Editor are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7168.
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
As noted in the Hyper Text Coffee Pot Control Protocol, coffee is renowned worldwide as an artfully brewed caffeinated beverage, but coffee shares this quality with many other varied preparations based on the filtration of plant material. Foremost, among these are the category of brews based on the straining of water through prepared leaves from a tea tree: the lineage and history of the tea genus will not be recounted as part of this paper, but evidence shows that the production of tea existed many thousands of years ago.
The deficiency of HTCPCP in addressing the networked production of such a venerable beverage as tea is noteworthy: indeed, the only provision given for networked teapots is that they not respond to requests for the production of coffee, which, while eminently reasonable, does not allow for communication with the teapot for its intended purpose.
This paper specifies an extension to HTCPCP to allow communication with networked tea production devices and teapots. The additions to the protocol specified herein permit the requests and responses necessary to control all devices capable of making, arguably, the most popular caffeinated hot beverage.
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.
The TEA extension to HTCPCP adapts the operation of certain HTCPCP methods.
BREW and POST Methods
Control of a TEA-capable pot is performed, as described in the base HTCPCP specification, through the sending of BREW requests. POST requests are treated equivalently, but they remain deprecated. Tea production differs from coffee, however, in that a choice of teas is often provided for client selection before the tea is brewed. To this end, a TEA-capable pot that receives a BREW message of content type message/teapot MUST respond in accordance with the URI requested, as below.
/ URI
For the URI /, brewing will not commence. Instead, an Alternates header as defined in RFC 2295 MUST be sent, with the available tea bags and/or leaf varieties as entries. An example of such a response is as follows:
The following example demonstrates the possibility of interoperability of a TEA-capable pot that also complies with the base HTCPCP specification:
TEA-capable HTCPCP clients MUST check the contents of the Alternates header returned by a BREW request, and provide a specific URI for subsequent requests of the message/teapot type.
A request to the / URI with a Content-Type header of message/coffeepot SHOULD also be responded to with an Alternates header in the above format, to allow TEA-capable clients the opportunity to present the selection of teas to the user if inferior caffeinated beverages have initially been requested.
TEA-capable pots follow the base HTCPCP specification when presented with a BREW request for a specific variety of tea. Pots SHOULD follow the recommendations for brewing strength given by each variety, and stop brewing when this strength is reached; it is suggested that the strength be measured by detection of the opacity of the beverage currently under brew by the pot.
TEA-capable clients SHOULD indicate the end of brewing by sending a BREW request with an entity body containing stop; the pot MAY continue brewing beyond the recommended strength until this is received. If the stop request is not sent by the client, this may result in a state inversion in the proportion of tea to water in the brewing pot, which may be reported by some pots as a negative strength.
If a BREW command with an entity body containing stop is received before the recommended strength is achieved, the pot MUST abort brewing and serve the resultant beverage at lesser strength. Finding the preferred strength of beverage when using this override is a function of the time between the TEA-capable pot receiving a start request and the subsequent stop. Clients SHOULD be prepared to make multiple attempts to reach the preferred strength.
HTCPCP-TEA modifies the definition of one header field from the base HTCPCP specification.
Accept-Additions Header Field
It has been observed that some users of blended teas have an occasional preference for teas brewed as an emulsion of cane sugar with hints of water. To allow for this circumstance, the Accept-Additions header field defined in the base HTCPCP specification is updated to allow the following options:
Implementers should be aware that excessive use of the Sugar addition may cause the BREW request to exceed the segment size allowed by the transport layer, causing fragmentation and a delay in brewing.
HTCPCP-TEA makes use of normal HTTP error codes and those defined in the base HTCPCP specification.
A BREW request to the / URI, as defined in Section 2.1.1, will return an Alternates header indicating the URIs of the available varieties of tea to brew. It is RECOMMENDED that this response be served with a status code of 300, to indicate that brewing has not commenced and further options must be chosen by the client.
Services that implement the Accept-Additions header field MAY return a 403 status code for a BREW request of a given variety of tea, if the service deems the combination of additions requested to be contrary to the sensibilities of a consensus of drinkers regarding the variety in question.
A method of garnering and collating consensus indicators of the most viable combinations of additions for each variety to be served is outside the scope of this document.
TEA-capable pots that are not provisioned to brew coffee may return either a status code of 503, indicating temporary unavailability of coffee, or a code of 418 as defined in the base HTCPCP specification to denote a more permanent indication that the pot is a teapot.
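The status codes described in this section can be gathered into one decision function. The following is only a sketch under stated assumptions: the BrewRequest structure, the teaStatusCode function and the additionsAcceptable flag are hypothetical names that appear nowhere in the specification, the choice of 418 over 503 is arbitrary (either is permitted), and the 200 returned for a successful brew is the ordinary HTTP success code rather than anything this extension defines.

```cpp
#include <string>

// Hypothetical summary of an incoming BREW request (illustrative names only)
struct BrewRequest {
    std::string uri;             // e.g. "/" or a variety-specific URI
    std::string contentType;     // "message/teapot" or "message/coffeepot"
    bool additionsAcceptable;    // assumed result of the pot's own additions check
};

// Pick a status code for a TEA-capable pot, following the rules above
int teaStatusCode(const BrewRequest& req, bool potBrewsCoffee) {
    // Coffee requests fall back to the base specification if the pot can
    // brew coffee; a tea-only pot reports that it is a teapot
    if (req.contentType == "message/coffeepot") {
        return potBrewsCoffee ? 200 : 418;
    }
    // A tea request to "/" lists the varieties; brewing has not commenced
    if (req.uri == "/") {
        return 300;
    }
    // The pot MAY refuse a combination of additions it deems unsuitable
    if (!req.additionsAcceptable) {
        return 403;
    }
    return 200;
}
```

A pot choosing to signal temporary unavailability of coffee would return 503 in place of 418 in the first branch.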
message/teapot Media Type
To distinguish messages destined for TEA-capable HTCPCP services from pots compliant with the base HTCPCP specification, a new MIME media type is defined by this document. The Content-Type header of a POST or BREW request sent to a TEA-capable pot MUST be message/teapot if tea is to be requested.
As noted in Section 2.1, a BREW request with a Content-Type header field of message/teapot to a TEA-capable pot will result in an Alternates header being sent with the response, and a pot will not be brewed. However, if the BREW request has a Content-Type of message/coffeepot, and the pot is capable of brewing coffee, the service's behavior will fall back to the base HTCPCP specification and a pot will be brewed.
If the entity returned by the server when brewing commences contains a TEA-compliant Alternates header indicating message/coffeepot and the client does not want coffee, the client SHOULD then send a BREW request with an entity body containing stop. This will result in wasted coffee; whether this is regarded as a bad thing is user-defined.
Such waste can be prevented by TEA-capable clients, by first requesting a BREW of type message/teapot and then allowing selection of an available beverage.
As with the base HTCPCP specification, most TEA-capable pots are expected to heat water through the use of electric elements, and as such will not be in proximity to fire. Therefore, no firewalls are necessary for communication with these pots to proceed.
This extension does support communication with fired pots, however, which may require heat retention and control policies. Care should be taken so that coal-fired pots and electrically heated kettles are not connected to the same network, to prevent pots from referring to any kettles on the network as darkened or otherwise smoke driven.
This extension to the HTCPCP specification would not be possible without the base specification, and research on networked beverage production leading up thereto. In that vein, the author wishes to acknowledge the sterling work of Larry Masinter in the development of the leading protocol for coffee pot communication.
Many thanks also to Kevin Waterson and Pete Davis, for providing guidance and suggestions during the drafting of this document.
Download the code: http://imrannazar.com/content/files/jpegparse.zip
Previously, I discussed the Huffman compression algorithm as implemented by JPEG, and a mechanism by which JPEG encoders pick the substitution values used for the compression. The image itself, in a baseline entropy-coded JPEG file, is stored in one "scan" as a stream of Huffman codes; the codes are of a variable length, and are not necessarily at even byte boundaries.
For this reason, it's a requirement that some kind of queue be introduced between the bytes of the file, and the values as seen by the JPEG decoder: this allows bits to be pushed onto the queue from the bytes of the file, and pulled out in varying amounts by the decoder. A single routine can thus be made the central routing point for any requests for variable-width symbols.
A queue of things, when used in this fashion, generally has two modes of operation: either there are enough things in the queue for the request to deal with, or additional reading has to be done before there are enough things in the queue to handle the request. In the case of a bit-level queue for reading from a file, as we need here, the number of bits requested is the determining factor. In the first example below, a queue holding five bits is asked to return the first three, and is able to handle the request without a problem.
If a request for four bits then comes in, the queue doesn't contain enough bits to handle the request, and must first fetch another full byte from the JPEG file (shown here in red).
A complication arises when trying to code an implementation of the bit queue as shown above: the output end of the queue is at the left, and if one contiguous value is used to store the queue, the left is the most-significant end with the highest-value bits. If we demarcate the queue as being 32 positions long, and define one 32-bit value as holding its entire contents, the above scenarios break down as follows:
At all times during the queue's operation, the next available bits are at the high-value end of the contiguous queue variable. An implementation of this may look like the following.
    class JPEG {
    private:
        // Read a number of bits from file
        u16 readBits(int);
    };

    u16 JPEG::readBits(int len)
    {
        // The number of bits left in the queue, and the value represented
        static int queueLen = 0;
        static u32 queueVal = 0;

        u8 readByte;
        u16 output;

        // Top up the queue a byte at a time until the request can be met
        while (len > queueLen) {
            // Read a byte in, shift it up to join the queue; the cast keeps
            // the shift unsigned for bytes of 0x80 and above
            readByte = fgetc(fp);
            queueVal = queueVal | ((u32)readByte << (24 - queueLen));
            queueLen += 8;
        }

        // Shift the requested number of bits down to the other end
        output = ((queueVal >> (32 - len)) & ((1 << len) - 1));

        queueLen -= len;
        queueVal <<= len;

        return output;
    }
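To try the queue without a JPEG file to hand, the same logic can be run over bytes held in memory. This is a minimal sketch, not the decoder's own code: the BitQueue class and its members are hypothetical names, requests are assumed to be between 1 and 16 bits, and reading past the end of the buffer is assumed to pad with zero bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone bit queue over an in-memory buffer
class BitQueue {
public:
    explicit BitQueue(const std::vector<std::uint8_t>& bytes) : data(bytes) {}

    // Return the next len bits (1 to 16), most significant bit first
    std::uint16_t readBits(int len) {
        // Fetch whole bytes until the queue can satisfy the request
        while (len > queueLen) {
            std::uint32_t next = (pos < data.size()) ? data[pos++] : 0;
            queueVal |= next << (24 - queueLen);
            queueLen += 8;
        }
        // The requested bits sit at the high-value end of the queue
        std::uint16_t output = static_cast<std::uint16_t>(queueVal >> (32 - len));
        queueVal <<= len;
        queueLen -= len;
        return output;
    }

private:
    std::vector<std::uint8_t> data;   // bytes standing in for the file
    std::size_t pos = 0;              // index of the next unread byte
    int queueLen = 0;                 // number of valid bits in queueVal
    std::uint32_t queueVal = 0;       // queue contents, left-aligned
};
```

With the two bytes B4 F0 loaded, asking for three bits and then four is served from the first byte alone; a following request for five bits forces the second byte to be fetched, as in the diagrams above.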
As mentioned in the previous part of this series, a JPEG file can define up to 32 Huffman code tables, each in their own DHT segment. A JPEG file holds the data corresponding to the image itself in a "frame", denoted by a "Start of Frame" segment header. The SOF header contains a part of the information required to decode the frame, and is structured according to the following table.
Field | Value | Size (bytes) |
---|---|---|
Precision (the number of bits per sample) | 8 | 1 |
Image height | Up to 65535 | 2 |
Image width | Up to 65535 | 2 |
Components | Number of colour components | 1 |
For each component (in a YUV-colour file, three) | ||
ID | Identifier for later use | 1 |
Sampling resolution | For later examination | 1 |
Quantisation table | For later examination | 1 |
As can be seen in the above table, some of the fields involve operations that we have not yet examined (the sampling resolution and the quantisation table for each component). For the purposes of completing the SOF segment handler, we can hold onto this information for later use.
A "frame" consists of the SOF header, and a number of "scans"; as the name implies, each scan is a pass over the full rectangle of the image. In a progressive JPEG file for example, multiple scans would be present for the single frame, each of them adding more detail than the last; in a baseline JPEG file, there is generally just one scan containing all the information for the image. Since this series is concerned with building a decoder for baseline JPEG files, we'll focus on dealing with a single scan in the frame.
It turns out that the information required by Huffman decoding, in particular which of the DHT tables to use, is defined by each scan in the frame instead of by the frame itself. We'll look at the scan-level information in more detail in the next part of this series; for now, it's sufficient to decouple our representation of a colour component in the image from the SOF data, and define the exact metadata for a component later.
With the information detailing the structure of an SOF header, it becomes relatively simple to build a segment parser to plug into our existing code. The only complication arises from the fact that multi-byte values in JPEG are stored in big-endian format, which may not necessarily be the host format for large integers. It's useful to have a set of definitions for transparently handling big-endian values, which is presented below.
    /**
     * Let's Build a JPEG Decoder
     * Big-endian value handling macros
     * Imran Nazar, May 2013
     */

    #ifndef __BYTESWAP_H_
    #define __BYTESWAP_H_

    #if __SYS_BIG_ENDIAN == 1
    # define htoms(x) (x)
    # define htoml(x) (x)
    # define mtohs(x) (x)
    # define mtohl(x) (x)
    #else
    # define htoms(x) (((x)>>8)|((x)<<8))
    # define htoml(x) (((x)<<24)|(((x)&0xFF00)<<8)|(((x)&0xFF0000)>>8)|((x)>>24))
    # define mtohs(x) (((x)>>8)|((x)<<8))
    # define mtohl(x) (((x)<<24)|(((x)&0xFF00)<<8)|(((x)&0xFF0000)>>8)|((x)>>24))
    #endif

    #endif//__BYTESWAP_H_
    // Prevent padding bytes from creeping into structures
    #define PACKED __attribute__((packed))

    class JPEG {
    private:
        // Information in the SOF header
        struct PACKED {
            u8 precision;
            u16 height;
            u16 width;
            u8 component_count;
        } sofHead;

        typedef struct PACKED {
            u8 id;
            u8 sampling;
            u8 q_table;
        } sofComponent;

        // Internal information about a colour component
        typedef struct PACKED {
            u8 id;
            // There is likely to be more data here...
        } Component;

        // The set of colour components in the image
        std::vector<Component> components;

        // The SOF segment handler
        int SOF();
    };
    int JPEG::parseSeg()
    {
        ...
        switch (id) {
            // The SOF segment defines the components and resolution
            // of the JPEG frame for a baseline Huffman-coded image
            case 0xFFC0:
                size = READ_WORD() - 2;
                if (SOF() != size) {
                    printf("Unexpected end of SOF segment\n");
                    return JPEG_SEG_ERR;
                }
                break;
            ...
        }

        return JPEG_SEG_OK;
    }
    int JPEG::SOF()
    {
        int ctr = 0, i;

        // Read the fixed part of the header, and fix up the byte order
        fread(&sofHead, sizeof(sofHead), 1, fp);
        ctr += sizeof(sofHead);

        sofHead.width = mtohs(sofHead.width);
        sofHead.height = mtohs(sofHead.height);
        printf("Image resolution: %dx%d\n", sofHead.width, sofHead.height);

        // Read each component's entry, keeping only the ID for now
        for (i = 0; i < sofHead.component_count; i++) {
            sofComponent s;
            fread(&s, sizeof(sofComponent), 1, fp);
            ctr += sizeof(sofComponent);

            Component c;
            c.id = s.id;
            components.push_back(c);
        }

        return ctr;
    }
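An alternative to byte-swapping macros and packed structures is to assemble each big-endian value from its individual bytes, which behaves the same whatever the host byte order. The following is a sketch of that approach rather than the code used in this series; readBE16, FrameInfo and parseFrameInfo are hypothetical names, and the input is assumed to be the six fixed bytes of the SOF header that follow the segment length.

```cpp
#include <cstdint>

// Hypothetical helper: build a 16-bit value from two big-endian bytes
inline std::uint16_t readBE16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Illustrative holder for the fixed part of the SOF header
struct FrameInfo {
    std::uint8_t precision;
    std::uint16_t height;
    std::uint16_t width;
    std::uint8_t componentCount;
};

// Parse precision, height, width and component count, in table order
inline FrameInfo parseFrameInfo(const std::uint8_t* p) {
    FrameInfo f;
    f.precision = p[0];
    f.height = readBE16(p + 1);
    f.width = readBE16(p + 3);
    f.componentCount = p[5];
    return f;
}
```

The trade-off is a little more code per field, in exchange for not depending on compiler-specific structure packing or on knowing the host's endianness at build time.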
As mentioned above, the image frame in a baseline JPEG file is encoded as a scan, composed of a series of blocks; depending on the sampling resolution of the components in the image, these blocks can be larger than the 8x8-pixel base block size of the JPEG algorithm. In the next part of this series, I'll examine the relationship between these larger units and the colour components of the image.
Imran Nazar <tf@imrannazar.com>, May 2013.
Download the code: http://imrannazar.com/content/files/jpegparse.zip
In the previous part, I mentioned that most JPEG files employ an encoding technique on top of the image compression, in an attempt to remove any trace of redundant information from the image. The technique used by the most common JPEG encoding is an adaptation of one seen throughout the world of data compression, known as Huffman coding, so it's useful to explore in detail the structure and implementation of a Huffman decoder.
Because Huffman coding is the last thing performed by a JPEG encoder when saving an image file, it needs to be the first thing done by our decoder. This can be achieved in two ways:
This article will take the second approach, to save memory and sacrifice time; full reconstitution can be implemented using the code built below in a very similar fashion.
The concept behind Huffman coding and other entropy-based schemes is similar to the concept behind the substitution cipher: each unique character in an input is transformed into a unique output character. The simplest example is the Caesar substitution, which can be represented in tabular form as follows:
    A => D
    B => E
    C => F
    ...
    Y => B
    Z => C

    This is an example of a Caesar cipher
    Wklv lv dq hadpsoh ri d Fdhvdu flskhu
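The substitution table above amounts to shifting each letter three places along the alphabet and wrapping around at the end. As a brief sketch (the caesar function is an illustrative name, and the shift is assumed to be between 0 and 25):

```cpp
#include <string>

// Shift each letter forward by `shift` places, wrapping from Z back to A;
// anything that is not a letter, such as a space, is left untouched
std::string caesar(const std::string& in, int shift) {
    std::string out = in;
    for (char& c : out) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>('a' + (c - 'a' + shift) % 26);
        } else if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>('A' + (c - 'A' + shift) % 26);
        }
    }
    return out;
}
```

Calling caesar("This is an example of a Caesar cipher", 3) gives the encoded line shown above.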
An improvement on the standard substitution cipher can be made by noting the relative frequency of characters in the input, and designing a table that gives the more frequent characters shorter codes than the rarer ones. Taking a look at the frequency of letters in the above example, with their ASCII representations included, we can produce a table of increasing unique codes such as the following:
Character | ASCII | Frequency | Code |
---|---|---|---|
Space | 00100000 | 7 | 00 |
a | 01100001 | 5 | 01 |
e | 01100101 | 4 | 100 |
i | 01101001 | 3 | 1010 |
s | 01110011 | 3 | 1011 |
h | 01101000 | 2 | 11000 |
p | 01110000 | 2 | 11001 |
r | 01110010 | 2 | 11010 |
C | 01000011 | 1 | 110110 |
T | 01010100 | 1 | 110111 |
c | 01100011 | 1 | 111000 |
f | 01100110 | 1 | 111001 |
l | 01101100 | 1 | 111010 |
m | 01101101 | 1 | 111011 |
n | 01101110 | 1 | 111100 |
o | 01101111 | 1 | 111101 |
x | 01111000 | 1 | 111110 |
Substituting these codes for the characters in the original text, it can be seen how the encoded data is much smaller than the original.
    This is an example of a Caesar cipher

    01010100 01101000 01101001 01110011 00100000 01101001 01110011 00100000 01100001 01101110 00100000 01100101 01111000 01100001 01101101 01110000 01101100 01100101 00100000 01101111 01100110 00100000 01100001 00100000 01000011 01100001 01100101 01110011 01100001 01110010 00100000 01100011 01101001 01110000 01101000 01100101 01110010

    110111 11000 1010 1011 00 1010 1011 00 01 111100 00 100 111110 01 111011 11001 111010 100 00 111101 111001 00 01 00 110110 01 100 1011 01 11010 00 111000 1010 11001 11000 100 11010

    54 68 69 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F 66 20 61 20 43 61 65 73 61 72 20 63 69 70 68 65 72

    DF 15 65 58 F8 4F 9E F3 D4 3D E4 4D 99 6E 8E 2B 38 9A
The main disadvantage of the Huffman coding method is that the table of codes needs to be stored alongside the compressed data: in the above example, the final string of encoded bytes would be meaningless without the corresponding code table. The table of codes and their corresponding characters can be recorded in full, but there is a more space-efficient way to save the codes, if attention is paid to the pattern of their occurrence. Two things are of note here: firstly that the codes increase in length, but also that within a group of the same length, codes are sequential. This means the code table can be written down as:
    2 codes of length two  , starting at 00
    1 code  of length three, starting at 100
    2 codes of length four , starting at 1010
    3 codes of length five , starting at 11000
    9 codes of length six  , starting at 110110
A careful eye on the codes themselves can yield further improvements on how much space it takes to record the encoding table. If we take a look at the codes in conjunction with the list of code length above, we can start counting as follows.
    00 (zero)
    01 (one)
    Next code would be 10 (two)
    100 (four)
    Next code would be 101 (five)
    1010 (ten)
    1011 (eleven)
    Next code would be 1100 (twelve)
    11000 (twenty four)
    11001 (twenty five)
    11010 (twenty six)
    Next code would be 11011 (twenty seven)
    110110 (fifty four)
    110111 (fifty five)
    111000 (fifty six)
    111001 (fifty seven)
    111010 (fifty eight)
    111011 (fifty nine)
    111100 (sixty)
    111101 (sixty one)
    111110 (sixty two)
In every case, when the requisite number of codes has been counted for the given code length, all that is needed is to double the counter and continue for the next code length. In other words, there is no need to record the "starting at" part of the code lengths list above, since it can be inferred by starting at zero. The final code list therefore looks as follows.
    2 codes of length two
    1 code  of length three
    2 codes of length four
    3 codes of length five
    9 codes of length six

    The above codes correspond to the following characters, in this order:
    Space,a,e,i,s,h,p,r,C,T,c,f,l,m,n,o,x
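The count-and-double rule can be checked with a few lines of code that regenerate every code from the list of counts alone. This is a sketch of the rule as described here; assignCodes and LengthCode are illustrative names, and the counts are assumed to be supplied for lengths starting at one bit.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One assigned code, as a (length in bits, code value) pair
typedef std::pair<int, std::uint16_t> LengthCode;

// counts[i] holds the number of codes of length i + 1 bits
std::vector<LengthCode> assignCodes(const std::vector<int>& counts) {
    std::vector<LengthCode> codes;
    std::uint16_t code = 0;   // the counter starts at zero
    for (std::size_t i = 0; i < counts.size(); i++) {
        // Hand out sequential codes for every entry of this length
        for (int j = 0; j < counts[i]; j++) {
            codes.push_back(LengthCode(static_cast<int>(i) + 1, code));
            code++;
        }
        // Moving to the next length doubles the counter
        code <<= 1;
    }
    return codes;
}
```

For the example, the counts are 0, 2, 1, 2, 3 and 9 for lengths one to six, and the ninth entry generated is the first six-bit code, 110110 (fifty four), matching the list above.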
A JPEG file's Huffman tables are recorded in precisely this manner: a list of how many codes are present of a given length (between 1 and 16), followed by the meanings of the codes in order. This information is held in the file's "Define Huffman Table" (DHT) segments, of which there can be up to 32, according to the JPEG standard.
As seen above, data encoded by the Huffman algorithm ends up recorded as a series of codes wedged together in a bit-stream; this also applies to the image scan in a JPEG file. A simple routine for reading codes from the bit stream may look like this:
    Code = 0
    Length = 0
    Found = False
    Do
        Code = Code << 1
        Code = Code | (The next bit in the stream)
        Length = Length + 1
        If ((Length, Code) is in the Huffman list) Then
            Found = True
        End If
    While Found = False
In order to facilitate this algorithm, the Huffman codes should be stored in a way that allows us to determine if a code is in the map at a given length. The canonical way to represent a Huffman code list is as a binary tree, where the sequence of branches defines the code and the depth of the tree tells us how long the code is. The C++ STL abstracts this out for us, into the map construct.
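To see the bit-by-bit lookup working against a map keyed on (length, code), the routine can be run over the encoded bytes from the earlier text example. This is a sketch only: decodeSymbols and HuffKey are hypothetical names, the map here holds characters rather than the byte values a JPEG table would hold, and the bits are assumed to be packed most significant bit first.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Key into the map: (code length in bits, code value)
typedef std::pair<int, std::uint16_t> HuffKey;

// Decode up to `symbols` characters from a packed bitstream
std::string decodeSymbols(const std::map<HuffKey, char>& table,
                          const std::vector<std::uint8_t>& bytes,
                          std::size_t symbols) {
    std::string out;
    std::uint16_t code = 0;
    int length = 0;
    for (std::size_t bit = 0; bit < bytes.size() * 8 && out.size() < symbols; bit++) {
        // Pull the next bit from the stream and append it to the code
        int next = (bytes[bit / 8] >> (7 - bit % 8)) & 1;
        code = static_cast<std::uint16_t>((code << 1) | next);
        length++;
        // A hit at this length completes a symbol; start the next one
        std::map<HuffKey, char>::const_iterator it = table.find(HuffKey(length, code));
        if (it != table.end()) {
            out += it->second;
            code = 0;
            length = 0;
        }
    }
    return out;
}
```

Given a table holding the codes for T, h, i, s and Space from the example, the bytes DF 15 65 decode to "This " with three bits left over for the next symbol.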
Since there are up to 32 possible Huffman tables that can be defined in a JPEG file, our implementation will require 32 maps to be available. It's also worth defining at this point how the DHT segment handler will be called by the parseSeg method developed in the previous part of this series.
    class JPEG {
    private:
        // Defines a tuple of length and code, for use in the Huffman maps
        typedef std::pair<int, u16> huffKey;

        // The array of Huffman maps: (length, code) -> value
        std::map<huffKey, u8> huffData[32];

        // DHT segment handler
        int DHT();
    };
    int JPEG::parseSeg()
    {
        ...
        switch (id) {
            // The DHT segment defines a Huffman table. The handler should
            // read exactly as many bytes from the file as are in the
            // segment; if not, something's gone wrong
            case 0xFFC4:
                size = READ_WORD() - 2;
                if (DHT() != size) {
                    printf("Unexpected end of DHT segment\n");
                    return JPEG_SEG_ERR;
                }
                break;
            ...
        }

        return JPEG_SEG_OK;
    }
    int JPEG::DHT()
    {
        int i, j;

        // A counter of how many bytes have been read
        int ctr = 0;

        // The incrementing code to be used to build the map
        u16 code = 0;

        // First byte of a DHT segment is the table ID, between 0 and 31
        u8 table = fgetc(fp);
        ctr++;

        // Next sixteen bytes are the counts for each code length
        u8 counts[16];
        for (i = 0; i < 16; i++) {
            counts[i] = fgetc(fp);
            ctr++;
        }

        // Remaining bytes are the data values to be mapped
        // Build the Huffman map of (length, code) -> value
        for (i = 0; i < 16; i++) {
            for (j = 0; j < counts[i]; j++) {
                huffData[table][huffKey(i + 1, code)] = fgetc(fp);
                code++;
                ctr++;
            }
            code <<= 1;
        }

        // Once the map has been built, print it out
        printf("Huffman table #%02X:\n", table);

        std::map<huffKey, u8>::iterator iter;
        for (iter = huffData[table].begin(); iter != huffData[table].end(); iter++) {
            printf(" %04X at length %d = %02X\n",
                iter->first.second, iter->first.first, iter->second);
        }

        return ctr;
    }
As with the previous part, the JPEG class can be instantiated with a filename; if this is done, the above code will produce output along the following lines:
    Found segment at file position 177: Huffman table
    Huffman table #00:
     0000 at length 2 = 04
     0002 at length 3 = 02
     0003 at length 3 = 03
     0004 at length 3 = 05
     0005 at length 3 = 06
     0006 at length 3 = 07
     000E at length 4 = 01
     001E at length 5 = 00
     003E at length 6 = 08
     007E at length 7 = 09
Once the Huffman maps have been built for a JPEG file, the image scan can be decoded for further processing. In the next part, I'll take a look at the Huffman decoding of the scan, in the wider context of reading blocks from the image and examining the process through which they are transformed.
Imran Nazar <tf@imrannazar.com>, Feb 2013.
Download the code: http://imrannazar.com/content/files/jpegparse.zip
In the previous part, I gave a brief overview of the techniques used by JPEG to compress an image. Before examining the detailed implementation of those techniques, it's useful to look at the overall structure of a JPEG file, for two reasons:
The implementation developed in this series of articles will be written in C++, but the constructs can be transplanted to a language of your choice with little additional complexity.
It should be stated at this juncture that the implementation developed here will only apply to one common subset of all the possible types of JPEG image. Firstly, there are four types of compression supported by the standard:
Further, there are two forms of encoding that are applied on top of the image compression, to further compress the file data:
This series will implement an entropy-coded baseline JPEG decoder.
A JPEG file is made up of segments of varying length, each of which starts with a "marker" to denote which kind of segment it is. There are 254 possible types of segment, but only a few are found in the type of image we'll be decoding:
Name | Short Name | Marker | Description | Length (bytes) |
---|---|---|---|---|
Start of Image | SOI | FF D8 | Delimits the start of the file | 2 |
Define Quantisation Table | DQT | FF DB | Values used by the decoder | 69 |
Define Huffman Table | DHT | FF C4 | Values used by the decompressor | Variable |
Start of Frame | SOF | FF C0 | Information for an entropy-coded baseline frame | 10 |
Start of Scan | SOS | FF DA | Encoded and compressed image bitstream | Variable |
End of Image | EOI | FF D9 | Delimits the end of the file | 2 |
Most of the different types of segment have a "length" value just after the marker, which denotes how long the segment is in bytes (including the length value); this can be used to skip over segments that a decoder doesn't know about. There are three exceptions to this general rule:
For this article, I'll assume the rest of the file is part of a scan if we run into an SOS segment, and skip straight to the EOI.
As a first step, it makes sense to write a program to open a JPEG file, and run through it looking for segment markers. The structure of such a program can be expanded upon with implementation for processing of the different kinds of segments, and the mechanism for skipping over segments given their size can be used later to skip over the parts of the file which are non-essential to the decoding process.
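Before involving file handling, the walk over segments can be sketched against a buffer in memory. The listMarkers function below is an illustrative name rather than part of the decoder; it assumes the whole file has already been read into a vector, and it takes the same shortcut as this article by treating everything after an SOS marker as scan data up to the final two bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Collect the marker of each segment found, in file order
std::vector<std::uint16_t> listMarkers(const std::vector<std::uint8_t>& file) {
    std::vector<std::uint16_t> markers;
    std::size_t pos = 0;
    while (pos + 2 <= file.size()) {
        std::uint16_t id = static_cast<std::uint16_t>((file[pos] << 8) | file[pos + 1]);
        if (id < 0xFFC0) break;              // not a segment marker
        markers.push_back(id);
        pos += 2;
        if (id == 0xFFD9) break;             // EOI: nothing follows
        if (id == 0xFFD8) continue;          // SOI: no length field
        if (pos + 2 > file.size()) break;    // truncated file
        if (id == 0xFFDA) {                  // SOS: skip to the closing EOI
            pos = file.size() - 2;
            continue;
        }
        // Any other segment: the length includes its own two bytes
        std::size_t size = static_cast<std::size_t>((file[pos] << 8) | file[pos + 1]);
        if (size < 2) break;                 // malformed length
        pos += size;
    }
    return markers;
}
```

Because every unknown segment is skipped by its stated length, the walk does not need to understand a segment's contents in order to step past it.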
Since the sizes of values in a JPEG file are specified in absolute terms of number of bytes, it's a good idea to abstract the basic integer types into types which refer to size. For this, we'll use a short header file.
    #ifndef __INTTYPES_H_
    #define __INTTYPES_H_

    typedef unsigned char u8;
    typedef unsigned short u16;
    typedef unsigned int u32;

    typedef signed char s8;
    typedef signed short s16;
    typedef signed int s32;

    #endif//__INTTYPES_H_
The above file is set up for 32-bit compilation, but can be adapted if 64- or 16-bit code is required. The advantage of this is that references to integers in the JPEG decoder implementation itself can be agnostic of architecture, and simply refer to u16 and other types defined here.
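On compilers that provide the standard fixed-width integer header, the same names can be defined without assuming anything about the size of short or int. This is an alternative sketch rather than the header used in this series, and it assumes a C++11 compiler on a platform where the exact-width types exist.

```cpp
#include <cstdint>

// The same short names, mapped onto the standard fixed-width types
typedef std::uint8_t  u8;
typedef std::uint16_t u16;
typedef std::uint32_t u32;

typedef std::int8_t  s8;
typedef std::int16_t s16;
typedef std::int32_t s32;

// Fail the build early if any type is not the size the decoder expects
static_assert(sizeof(u8) == 1, "u8 must be one byte");
static_assert(sizeof(u16) == 2, "u16 must be two bytes");
static_assert(sizeof(u32) == 4, "u32 must be four bytes");
```

The static_assert lines turn a wrong assumption about the platform into a compile error instead of a subtly corrupted decode.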
With these abstractions in place, the implementation of a segment listing is quite simple. Since we'll be building the decoding functionality into a class, it's worth defining the class itself at this time.
    #ifndef __JPEG_H_
    #define __JPEG_H_

    #include "inttypes.h"
    #include <string>
    #include <vector>
    #include <map>
    #include <stdio.h>

    // Macro to read a 16-bit word from file
    #define READ_WORD() readWord()

    // Segment parsing error codes
    #define JPEG_SEG_ERR 0
    #define JPEG_SEG_OK 1
    #define JPEG_SEG_EOF -1

    class JPEG {
    private:
        // Names of the possible segments
        std::string segNames[64];

        // The file to be read from, opened by constructor
        FILE *fp;

        // Read the high byte, then the low byte, as separate statements:
        // the operands of | may be evaluated in either order, so two
        // fgetc calls in a single expression could swap the bytes
        int readWord() {
            int hi = fgetc(fp);
            int lo = fgetc(fp);
            return (hi << 8) | lo;
        }

        // Segment parsing dispatcher
        int parseSeg();

    public:
        // Construct a JPEG object given a filename
        JPEG(std::string);
    };

    #endif//__JPEG_H_
    #include "jpeg.h"
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>

    //-------------------------------------------------------------------------
    // Function: Parse JPEG file segment (parseSeg)
    // Purpose: Retrieves 16-bit block ID from file, shows name

    int JPEG::parseSeg()
    {
        if (!fp) {
            printf("File failed to open.\n");
            return JPEG_SEG_ERR;
        }

        u32 fpos = ftell(fp);
        u16 id = READ_WORD(), size;
        if (id < 0xFFC0) {
            printf("Segment ID expected, not found.\n");
            return JPEG_SEG_ERR;
        }

        printf(
            "Found segment at file position %d: %s\n",
            fpos, segNames[id-0xFFC0].c_str());

        switch (id) {
            // The SOI and EOI segments are the only ones not to have
            // a length, and are always a fixed two bytes long; do
            // nothing to advance the file position
            case 0xFFD9:
                return JPEG_SEG_EOF;
            case 0xFFD8:
                break;

            // An SOS segment has a length determined only by the
            // length of the bitstream; for now, assume it's the rest
            // of the file less the two-byte EOI segment
            case 0xFFDA:
                fseek(fp, -2, SEEK_END);
                break;

            // Any other segment has a length specified at its start,
            // so skip over that many bytes of file
            default:
                size = READ_WORD();
                fseek(fp, size-2, SEEK_CUR);
                break;
        }

        return JPEG_SEG_OK;
    }

    //-------------------------------------------------------------------------
    // Function: Array initialisation (constructor)
    // Purpose: Fill in arrays used by the decoder, decode a file
    // Parameters: filename (string) - File to decode

    JPEG::JPEG(std::string filename)
    {
        // Debug messages used by parseSeg to tell us which segment we're at
        segNames[0x00] = std::string("Baseline DCT; Huffman");
        segNames[0x01] = std::string("Extended sequential DCT; Huffman");
        segNames[0x02] = std::string("Progressive DCT; Huffman");
        segNames[0x03] = std::string("Spatial lossless; Huffman");
        segNames[0x04] = std::string("Huffman table");
        segNames[0x05] = std::string("Differential sequential DCT; Huffman");
        segNames[0x06] = std::string("Differential progressive DCT; Huffman");
        segNames[0x07] = std::string("Differential spatial; Huffman");
        segNames[0x08] = std::string("[Reserved: JPEG extension]");
        segNames[0x09] = std::string("Extended sequential DCT; Arithmetic");
        segNames[0x0A] = std::string("Progressive DCT; Arithmetic");
        segNames[0x0B] = std::string("Spatial lossless; Arithmetic");
        segNames[0x0C] = std::string("Arithmetic coding conditioning");
        segNames[0x0D] = std::string("Differential sequential DCT; Arithmetic");
        segNames[0x0E] = std::string("Differential progressive DCT; Arithmetic");
        segNames[0x0F] = std::string("Differential spatial; Arithmetic");
        segNames[0x10] = std::string("Restart");
        segNames[0x11] = std::string("Restart");
        segNames[0x12] = std::string("Restart");
        segNames[0x13] = std::string("Restart");
        segNames[0x14] = std::string("Restart");
        segNames[0x15] = std::string("Restart");
        segNames[0x16] = std::string("Restart");
        segNames[0x17] = std::string("Restart");
        segNames[0x18] = std::string("Start of image");
        segNames[0x19] = std::string("End of image");
        segNames[0x1A] = std::string("Start of scan");
        segNames[0x1B] = std::string("Quantisation table");
        segNames[0x1C] = std::string("Number of lines");
        segNames[0x1D] = std::string("Restart interval");
        segNames[0x1E] = std::string("Hierarchical progression");
        segNames[0x1F] = std::string("Expand reference components");
        segNames[0x20] = std::string("JFIF header");
        segNames[0x21] = std::string("[Reserved: application extension]");
        segNames[0x22] = std::string("[Reserved: application extension]");
        segNames[0x23] = std::string("[Reserved: application extension]");
        segNames[0x24] = std::string("[Reserved: application extension]");
        segNames[0x25] = std::string("[Reserved: application extension]");
        segNames[0x26] = std::string("[Reserved: application extension]");
        segNames[0x27] = std::string("[Reserved: application extension]");
        segNames[0x28] = std::string("[Reserved: application extension]");
        segNames[0x29] = std::string("[Reserved: application extension]");
        segNames[0x2A] = std::string("[Reserved: application extension]");
        segNames[0x2B] = std::string("[Reserved: application extension]");
        segNames[0x2C] = std::string("[Reserved: application extension]");
        segNames[0x2D] = std::string("[Reserved: application extension]");
        segNames[0x2E] = std::string("[Reserved: application extension]");
        segNames[0x2F] = std::string("[Reserved: application extension]");
        segNames[0x30] = std::string("[Reserved: JPEG extension]");
        segNames[0x31] = std::string("[Reserved: JPEG extension]");
        segNames[0x32] = std::string("[Reserved: JPEG extension]");
        segNames[0x33] = std::string("[Reserved: JPEG extension]");
        segNames[0x34] = std::string("[Reserved: JPEG extension]");
        segNames[0x35] = std::string("[Reserved: JPEG extension]");
        segNames[0x36] = std::string("[Reserved: JPEG extension]");
        segNames[0x37] = std::string("[Reserved: JPEG extension]");
        segNames[0x38] = std::string("[Reserved: JPEG extension]");
        segNames[0x39] = std::string("[Reserved: JPEG extension]");
        segNames[0x3A] = std::string("[Reserved: JPEG extension]");
        segNames[0x3B] = std::string("[Reserved: JPEG extension]");
        segNames[0x3C] = std::string("[Reserved: JPEG extension]");
        segNames[0x3D] = std::string("[Reserved: JPEG extension]");
        segNames[0x3E] = std::string("Comment");
        segNames[0x3F] = std::string("[Invalid]");

        // Open the requested file, keep parsing blocks until we run
        // out of file, then close it.
        fp = fopen(filename.c_str(), "rb");
        if (fp) {
            while(parseSeg() == JPEG_SEG_OK);
            fclose(fp);
        }
        else {
            perror("JPEG");
        }
    }
When constructed with a file, an object of this JPEG class will provide output similar to the following.
    Found segment at file position 0: Start of image
    Found segment at file position 2: JFIF header
    Found segment at file position 20: Quantisation table
    Found segment at file position 89: Quantisation table
    Found segment at file position 158: Baseline DCT; Huffman
    Found segment at file position 177: Huffman table
    Found segment at file position 208: Huffman table
    Found segment at file position 289: Huffman table
    Found segment at file position 318: Huffman table
    Found segment at file position 371: Start of scan
    Found segment at file position 32675: End of image
As can be seen above, the "scan" constitutes the majority of an entropy-coded baseline JPEG; since the entirety of the image data is encoded within the scan, this makes sense. Entropy coding is based on the Huffman compression algorithm, so in the next article I'll examine the parts of a JPEG file that provide the information needed to decode the scan from a bitstream into something usable for further processing.
Imran Nazar <tf@imrannazar.com>, Jan 2013.
Build a JPEG decoder? Whatever for, when we have so many of them already?
JPEG is something we all take for granted: most of the Web comprises pictures transmitted as JPEG files, and video files based on JPEG technology. As it turns out, the concepts that lie behind these images span nearly two hundred years of mathematics and computing theory, and going from the raw file to an image takes a bunch of interesting work.
In An Introduction to Compression, I looked at the difference between "lossless" and "lossy" compression: the difference between the two methods is that lossless compression preserves all the inherent information of the input, whereas lossy compression throws much of it away. Throwing information away only works when it can be deemed unnecessary for proper handling of the file; this would never be the case for a computer program, for example, where every byte is a statement that must be retained.
Images or videos, and their cousins in the audio world, rely on perception to work out what needs to be thrown away: just as the human ear can only distinguish sounds in a particular region of frequencies, the human eye has a particular resolution and any colour changes that happen within a very short distance are essentially invisible. Resolution can also be thought of as "visual frequency", and can be manipulated in much the same way as sound wave frequency or other kinds of wave.
It follows that a chunk of sound or a piece of an image can be compressed by removing those parts of the frequency range that are outside the human experience: those frequency ranges that we don't care about, without which the essence isn't lost. There are three steps to doing that:
The transformation used is derived from the Fourier transform, which takes an integrable mathematical function f(x) and generates an equation f(s) for a frequency spectrum. The Fourier transform only works with continuous ranges of numbers, and extends from negative infinity to positive infinity; this makes it unusable for digital transformations like those we need here. Instead, a discrete version of the Fourier transform is used: the MP3 and JPEG techniques use the discrete cosine transform (DCT) to change a set of data values into an equivalent set of frequencies.
Two examples of the compression process are presented in the following figures: first for a short sound sample.
The below figure represents the same process as above, but applied in two dimensions of space as opposed to one of time. It is generally considered inefficient to transform the entire image into a visual frequency domain at once, so the JPEG algorithm transforms blocks of eight pixels square at a time.
In Figure 2, the right-hand set of images show the DCT and its filtered version. The most important figure in each 8x8 block is the "DC component" in the top-left, which determines the base level for the whole block. Values to the right of this give information as to how often variances happen horizontally, and conversely values below the DC component provide vertical frequency information. It follows that values in the bottom-right of a DCT block describe the highest-fidelity changes in the image, and that filtering consists of drawing a diagonal across each block and retaining the top portion, throwing away the information regarding high-fidelity changes.
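To make the block transform and the diagonal filter a little more concrete, here is a minimal Python sketch. It is a naive, unoptimised DCT rather than anything a real decoder would use, and the names `dct_2d` and `low_pass`, along with the `keep` parameter, are my own for illustration.

```python
import math

N = 8  # the transform is applied to blocks of eight pixels square

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):              # vertical frequency (rows of the result)
        for u in range(N):          # horizontal frequency (columns)
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * total
    return out

def low_pass(coeffs, keep):
    """Keep coefficients above the diagonal u + v == keep, zero the rest."""
    return [[c if u + v < keep else 0.0 for u, c in enumerate(row)]
            for v, row in enumerate(coeffs)]

# A perfectly flat block carries all of its information in the DC component
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
filtered = low_pass(coeffs, 4)
```

For the flat block, only the top-left DC value is non-zero, so the filter has nothing to throw away; a block containing fine detail would lose its bottom-right coefficients instead.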
JPEG makes use of the fact that the human eye has a maximum resolution when it comes to visual changes. Another feature of the eye is that it is less sensitive to changes in colour than in brightness: the frequency-sensitive "cone" cells of the retina occur at a lower density than the simpler frequency-agnostic "rods", which means a lower visual resolution for colour.
It is possible for images to be further compressed by utilising this information, reducing the amount of information used to encode colour values in relation to brightness. Unfortunately, the traditional additive colour space of red/green/blue used by computer and television displays retains no information about relative brightness and colour saturation; in order to retrieve this information RGB values must be transformed to another colour model.
The JPEG format most commonly uses Y'CbCr colour, where Y' denotes the luminance of a particular pixel, and the Cb and Cr components describe the amount of chrominance on two axes, corresponding to percentages of blue and of red. Transformations from RGB pixel values to Y'CbCr act as a rotation between the cube of all possible RGB values and the cube of possible luminance/chrominance values, as shown in Figure 3.
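As a rough illustration of the transformation, the following Python sketch converts one 8-bit RGB pixel using the full-range coefficients commonly associated with JFIF files. The function name `rgb_to_ycbcr` is hypothetical, and the rounding and clamping choices are assumptions rather than anything mandated here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y'CbCr (JFIF-style)."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    # Round to the nearest integer and keep each component within 0..255
    return tuple(max(0, min(255, round(c))) for c in (y, cb, cr))

white = rgb_to_ycbcr(255, 255, 255)   # (255, 128, 128)
black = rgb_to_ycbcr(0, 0, 0)         # (0, 128, 128)
```

Greys of any brightness sit at the centre of both chrominance axes (128), which is why the Y' channel alone looks like a greyscale copy of the image.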
Once transformed to Y'CbCr, the chrominance channels are separated out and can be manipulated. In this case, "downsampling" is employed: the resolution of the colour channels is halved in both dimensions, such that one "block" of colour information covers the equivalent area of four luminance blocks. In Figure 4, the colour channels have been downsampled by this ratio: it can be seen that the Y' channel is the most important for the integrity of the image, and thus its resolution remains high.
For the remainder of this series of articles, bidirectional downsampling of the type shown here will be assumed: most JPEG images on the Web employ this colour compression, so it's useful to explore. Because of the reduction in resolution, the previously mentioned eight-pixel square used by the JPEG algorithm becomes a minimum unit size of 16 pixels square, with four 8x8 luminance blocks accompanying one block for each axis of colour information.
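The downsampling step can be sketched in a few lines of Python. This version assumes that each 2x2 group of chrominance samples is simply averaged, which is one plausible choice among several an encoder might make; `downsample_2x2` is a name invented for this example.

```python
def downsample_2x2(plane):
    """Halve a chrominance plane in both dimensions by averaging 2x2 groups.

    Assumes the plane's width and height are both even.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4   # rounded mean
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]

# A 16x16 chrominance area collapses to a single 8x8 block
cb_area = [[128] * 16 for _ in range(16)]
cb_block = downsample_2x2(cb_area)
```

Applied to both the Cb and Cr planes, this leaves one 8x8 block per colour axis for every 16x16 area, alongside the four untouched 8x8 luminance blocks.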
JPEG files store additional information alongside the encoded image: lookup tables to be used by the decoding process, comments and resolution information. In part two, I'll take a look at the segments that make up a JPEG file, and how to hold onto some of the information provided for use by the decoding implementation of subsequent parts.
Imran Nazar <tf@imrannazar.com>, Jan 2013.
Like any lazy programmer, I automated the process into a Bash script.
    #!/bin/bash

    REPOROOT="/home/inazar/code"
    PREFIX="foo-"
    POSTFIX="-bar"

    t=`mktemp /tmp/svnbr_XXXXX`
    pushd . > /dev/null
    cd $REPOROOT

    for a in `find . -maxdepth 1 -name "$PREFIX*$POSTFIX"`; do
        cd $a
        b=`svn info 2> /dev/null | grep URL | awk -F'/' '{print $NF}'`

        # Strip the leading "./" left by find before removing the prefix,
        # otherwise the prefix never matches at the start of the name
        c=${a%$POSTFIX}
        c=${c:2}
        c=${c#$PREFIX}
        d=""

        # Caveat: Display will fall over with repo names past 16 chars
        [[ ${#c} -lt 8 ]] && d="\t"
        echo -e "$c\t$d $b" >> $t
        cd ..
    done

    popd > /dev/null
    sort $t
    rm $t
Another espresso later, John took to the task of surveying the damage. It wasn't too bad, actually: Ryan had been even more trashed, and had managed to spray the contents of the bookcase around the room. John found Tolkien behind the TV, and Dostoyevsky perched miraculously above the wall clock, but Asimov would have to wait until the pile of beer cans could be examined more closely. John wasn't feeling up to that right now.
It looked like one thing was broken: the frame around the mirror. Evidently, a coffee table had been lobbed at that wall at some point (a missing chunk of plaster, and the fine coating of gypsum on the table in question, attested to that). Amazingly, the mirror wasn't broken: the wooden frame had collapsed and lay around it, but the mirror itself was intact on the floor.
John picked up the glass, and felt that there was a slightly serrated edge on the bottom. He'd never seen the back of the mirror before, and looking closer at it revealed an intricate pattern of lines, converging on a group of 20-some parallel rows corresponding to the bumps along the bottom. John had seen something like this only a few times before, and every time it was something quite special.
An electronic circuit.
Obviously, no clue was forthcoming about what the circuit contained, or the secrets encoded within any solid-state memories that might be on the back of the mirror. John thought about enlisting Ryan's help to try to decipher the tracings, but he was still knocked out on the sofa, and would probably remain there for much of the morning.
John would have to go outside for this one.
In order to keep the examples here simple, I've sought a form of program that has a simple instruction set. Many processors have a simple instruction set: the 6502 has around 150 unique codes in its architecture, which can be reduced further with careful selection of instructions. However, having written about Brainfuck before, it seems the logical choice: eight instructions, each with a unique code, and all other codes are ignored during execution.
The program to be encoded, then, is as follows:
    >>++++++++[<+++[<+++++>-]>-]<<[>+>+>+>+>+<<<<<-]++++++++[>->-->
    --->---->-----<<<<<-]<++++[>++++++++<-]>>>>>>++++.<<<.>+++++.<<
    <.>>+++++.++.-.---.>.<<+++++++++.<.>>--.<------.<.>>.+++++.<<.>
    +.>------.>.<<<.>>>-.<+.<-.>-.<++++.>>---.<<----.>.>++++.<<-.

    The monkey is in the dishwasher
In order for the program to be converted to an image, the first step is to take the program as a stream of numbers. In Brainfuck, the eight operators can be treated as eight numbers to be inserted into a stream:
Operator | Code | Operator | Code |
---|---|---|---|
+ | 0x2B | - | 0x2D |
[ | 0x5B | ] | 0x5D |
< | 0x3C | > | 0x3E |
. | 0x2E | , | 0x2C |
Hiding these values within image data would be difficult if the data were full-colour (24- or 32-bit), since the colour component would need to correspond exactly at a given point in the image. Instead of using full-colour images, it makes more sense to use palette-based images, where an 8-bit palette index refers to a table of colours defined beforehand.
A simple example of palette data would be a graded greyscale image of sixteen shades, like the one below:
If this image is produced in a standard fashion, a palette of sixteen entries and 240 blanks is saved alongside the image. By expanding the palette entries used to the maximum allotted number of 256, it's possible to use the extra entries for hiding information:
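  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | | | | | | | | | | | | | | | | |
1 | | | | | | | | | | | | | | | | |
2 | | | | | | | | | | | | + | , | - | . | |
3 | | | | | | | | | | | | | < | | > | |
4 | | | | | | | | | | | | | | | | |
5 | | | | | | | | | | | | [ | | ] | | |
6 | | | | | | | | | | | | | | | | |
7 | | | | | | | | | | | | | | | | |
8 | | | | | | | | | | | | | | | | |
9 | | | | | | | | | | | | | | | | |
A | | | | | | | | | | | | | | | | |
B | | | | | | | | | | | | | | | | |
C | | | | | | | | | | | | | | | | |
D | | | | | | | | | | | | | | | | |
E | | | | | | | | | | | | | | | | |
F | | | | | | | | | | | | | | | | |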
As can be seen above, each row of the palette is the same shade of grey, except where the Brainfuck operators map onto the palette. By changing arbitrary pixels in the image, it's possible to replace any given #2-grey pixel with, for example, a + operator without changing the row of the palette used. Shifting pixel values along the palette row in this fashion allows us to change the value without changing the look of the image.
Before applying this concept to the example image in Figure 1, it's important to consider the type of image needed for this to work.
The major issue presented by image formats is that of compression: in order to make the image file size smaller, various methods of compression can be used to translate areas of similarity in the image to a much smaller amount of data than that represented by the raw image. Of course, when this is done the resultant compressed data bears no superficial relation to the raw image data; attempting to encode a Brainfuck program into the compressed data stream is beyond the scope of this article.
Instead, I'll be focussing on the raw image data itself; storage of the raw data can be achieved through an uncompressed data format such as TIFF. Examples of uncompressed 8-bit palette image formats include BMP and PCX; the former will be used here, due to its prevalence as a major image file format. When used as an uncompressed 8-bit format, the structure of a BMP file is as follows:
Field | Size (bytes) | Value |
---|---|---|
Signature | 2 | "BM" |
File size | 4 | Size of the full BMP |
Reserved | 4 | 0x00000000 |
Data offset | 4 | Offset of the pixel array |
BITMAPINFOHEADER | | |
Header size | 4 | 0x00000028 |
Image width | 4 | Width of the image in pixels |
Image height | 4 | Height of the image in pixels |
Colour planes | 2 | 0x0001 |
Colour depth | 2 | 0x0008 |
Compression method | 4 | 0x00000000 |
Uncompressed image size | 4 | Width * height * bytes-per-pixel (each row padded to a multiple of 4 bytes) |
Horizontal DPM | 4 | Pixels per metre, horizontal |
Vertical DPM | 4 | Pixels per metre, vertical |
Colours used | 4 | Number of unique colours |
Important colours | 4 | Number of important colours |
Colour table (256 entries) | | |
Colour | 4 | Blue, green, red and reserved bytes |
Pixel data, 1 byte per pixel | | |
The end result of this process is to produce a bitmap file that can be run as a Brainfuck program; to that end, we must be careful that the header data and colour table don't contain Brainfuck operators that may corrupt the initial state of the program. This is, fortunately, relatively easy: all that is needed is to pick an image width and height that don't contain an operator within the byte values, and the rest should automatically fall into place.
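A quick way to check a candidate width and height is to build the 54 bytes of header and scan them for operator values. The Python sketch below does that under a few assumptions of mine: a resolution of 2835 pixels per metre, 256 colours marked as both used and important, and the hypothetical helper names `build_header` and `find_operator_bytes`.

```python
import struct

BF_OPERATORS = set(b"+-<>[].,")    # the eight operator byte values

def build_header(width, height, dpm=2835):
    """Pack the file header and BITMAPINFOHEADER for an 8-bit uncompressed BMP."""
    row_size = (width + 3) & ~3            # rows are padded to 4 bytes
    image_size = row_size * height
    data_offset = 14 + 40 + 256 * 4        # headers plus the colour table
    return struct.pack(
        "<2sIIIIiiHHIIiiII",
        b"BM", data_offset + image_size, 0, data_offset,   # file header
        40, width, height, 1, 8, 0,                        # size .. compression
        image_size, dpm, dpm, 256, 256)                    # size, DPM, colours

def find_operator_bytes(data):
    """Return (offset, value) for every byte that is a Brainfuck operator."""
    return [(i, b) for i, b in enumerate(data) if b in BF_OPERATORS]

clean = find_operator_bytes(build_header(64, 64))   # no operators found
dirty = find_operator_bytes(build_header(43, 64))   # width 43 is 0x2B, a "+"
```

The same scan can be run over the colour table once it has been filled in, since a grey level of 0x2B or 0x3C would be interpreted as an operator just as readily as a header byte.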
One other quirk of the BMP format is the order in which pixel data is stored. Most image formats treat the first row as the top row, in accordance with computer graphics principles; BMP lines up with mathematical graphing principles, and treats the bottom row as the first.
This means that the Brainfuck operators need to be encoded into the image starting at the bottom, running left-right, then moving up to the next row. By examining each operator in the program in turn, and finding the next appropriate pixel (of the same row in the palette) to replace, the following result is obtained.
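The replacement pass described above can be sketched in Python as follows. The sketch assumes the pixel data is already in BMP storage order (bottom row first), that the untouched image only uses palette indices that are not themselves operators, and that a pixel's palette row is its upper four bits; `embed` is a name made up for the example.

```python
OPERATORS = b"+-<>[].,"

def embed(pixels, program):
    """Hide a Brainfuck program in a sequence of 8-bit palette indices.

    Each operator replaces the next pixel that sits in the same palette
    row (same high nibble), so the visible shade does not change.
    """
    out = bytearray(pixels)
    pos = 0
    for op in program.encode("ascii"):
        if op not in OPERATORS:
            continue                       # comments are not embedded
        while pos < len(out) and (out[pos] >> 4) != (op >> 4):
            pos += 1                       # skip pixels of other shades
        if pos == len(out):
            raise ValueError("ran out of suitable pixels")
        out[pos] = op
        pos += 1
    return bytes(out)

original = bytes([0x20, 0x50, 0x20, 0x30, 0x50, 0x20])
hidden = embed(original, "+[-]")
recovered = bytes(b for b in hidden if b in OPERATORS).decode("ascii")
```

Because every pixel stays within its palette row, the program is recovered by reading the pixel data in order and ignoring anything that is not an operator, which is exactly what a Brainfuck interpreter does.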
Once the appropriate pixels have been replaced with versions shifted along the palette row, the palette can be amended to set the Brainfuck operators to the same colour as the rest of the row in which they reside. Doing this, and producing an uncompressed bitmap of the result, provides the following.
Running the bitmap image, shown on the right, through a Brainfuck interpreter results in the targeted message being revealed.
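    <?php
    // Everything before the first "!" is code; anything after it is input.
    // explode() replaces split(), which was removed in PHP 7.
    list($c, $i) = array_pad(explode('!', file_get_contents($_SERVER['argv'][1])), 2, '');

    // Discard every byte that isn't a Brainfuck operator
    $c = preg_replace('#[^\[\]\<\>,.+-]#', "", $c);
    $p = $v = 0;

    // Translate each operator to its PHP equivalent, and run the result
    eval(strtr($c, array(
        "]" => "}",
        "[" => 'while($m[$p]){',
        "+" => '$m[$p]++;',
        "-" => '$m[$p]--;',
        ">" => '$p++;',
        "<" => '$p--;',
        "," => 'if(strlen($i)>$v)$m[$p]=ord($i[$v++]);',
        "." => 'echo chr($m[$p]);'
    )));
    ?>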
    $ php -f brainfuck.php http://imrannazar.com/content/img/bf-bmp-final.bmp
    The monkey is in the dishwasher$
The process outlined above is simple enough that it can be automated: given a sixteen-step greyscale image, seek from the bottom left for a pixel in the desired palette row, replace the pixel and move forward. This, however, isn't ideal for steganographic purposes; the Brainfuck operators leave a unique signature in a bitmap of otherwise uniform numbers. There are two simple ways to increase the level of obfuscation:
Imran Nazar <tf@imrannazar.com>, Nov 2011.
Cyrius, glowing a faint blue: the scientists said it was one of the brightest stars for a million light years, but out here it was a dull glow in the sky. Betel, a faraway red, constant in the sky since time immemorial. And the others, each one with a name known by all: even the alphabet had been adapted to thirty letters, one for each star in the sky.
Tonight was different. There were twenty nine stars: the letter B was missing. Betel had somehow gone out, its ruddy glow doused by the dark of space. Either something had happened to it, or the light was being... blocked in some way. The central astronomy agency soon found out that Betel wasn't missing; the latter suspicion was in fact true. A device of some sort was approaching from the rough direction of Betel, and only now was it close enough to block out the star's gaze upon us.
Morning came, and the strange entity was now visible as a blot against the iodine sky. Then it began to... expand, to unfurl, as though it were a net to cover the entire sky. Before too long, it had covered the sky in a grid of wire-thin lines. Then came a voice, deep and resonating: "SIMULATION ENDS IN TEN MINUTES."
The issue of transmission encoding is obviously a solved problem: FTP has a binary mode, which avoids text-based translation, while email has MIME-formatted attachments for the inclusion of files. Another way to approach the issue, however, is to produce executables which avoid the occurrence of the problem. Since a program, like any file, is nothing more than a stream of numbers, we can alleviate the transmission issue by producing programs which only use numbers that would be found in a text file; in the ASCII encoding standard, we can define these numbers as the "printable range", between 32 and 126.
Depending on the target CPU for the executable program, the limitation of only using the printable range will have different effects: for some processors, it is impossible to write a program under these constraints. In this article, I'll be looking at the combination of the x86 PC and MS-DOS, since this allows for a wide range of instructions to be used.
Since the x86 base instruction set is derived from the 8080 CPU, a number of groups of instructions fall under the printable range, including conditional branches and arithmetic manipulation. The full list is as follows.
Byte/char | Instruction | Byte/char | Instruction | Byte/char | Instruction |
---|---|---|---|---|---|
20/(space) | AND r/m,r (8-bit) | 40/@ | INC AX | 60/` | PUSHA |
21/! | AND r/m,r (16-bit) | 41/A | INC CX | 61/a | POPA |
22/" | AND r,r/m (8-bit) | 42/B | INC DX | 62/b | BOUND r,m |
23/# | AND r,r/m (16-bit) | 43/C | INC BX | 63/c | ARPL r/m,r (16-bit) |
24/$ | AND AL,imm | 44/D | INC SP | 64/d | FS: |
25/% | AND AX,imm | 45/E | INC BP | 65/e | GS: |
26/& | ES: | 46/F | INC SI | 66/f | N/A |
27/' | DAA | 47/G | INC DI | 67/g | N/A |
28/( | SUB r/m,r (8-bit) | 48/H | DEC AX | 68/h | PUSH imm16 |
29/) | SUB r/m,r (16-bit) | 49/I | DEC CX | 69/i | IMUL r,r/m,imm16 |
2A/* | SUB r,r/m (8-bit) | 4A/J | DEC DX | 6A/j | PUSH imm8 |
2B/+ | SUB r,r/m (16-bit) | 4B/K | DEC BX | 6B/k | IMUL r,r/m,imm8 |
2C/, | SUB AL,imm8 | 4C/L | DEC SP | 6C/l | INSB |
2D/- | SUB AX,imm16 | 4D/M | DEC BP | 6D/m | INSW |
2E/. | CS: | 4E/N | DEC SI | 6E/n | OUTSB |
2F// | DAS | 4F/O | DEC DI | 6F/o | OUTSW |
30/0 | XOR r/m,r (8-bit) | 50/P | PUSH AX | 70/p | JO rel8 |
31/1 | XOR r/m,r (16-bit) | 51/Q | PUSH CX | 71/q | JNO rel8 |
32/2 | XOR r,r/m (8-bit) | 52/R | PUSH DX | 72/r | JC rel8 |
33/3 | XOR r,r/m (16-bit) | 53/S | PUSH BX | 73/s | JNC rel8 |
34/4 | XOR AL,imm | 54/T | PUSH SP | 74/t | JZ rel8 |
35/5 | XOR AX,imm | 55/U | PUSH BP | 75/u | JNZ rel8 |
36/6 | SS: | 56/V | PUSH SI | 76/v | JBE rel8 |
37/7 | AAA | 57/W | PUSH DI | 77/w | JNBE rel8 |
38/8 | CMP r/m,r (8-bit) | 58/X | POP AX | 78/x | JS rel8 |
39/9 | CMP r/m,r (16-bit) | 59/Y | POP CX | 79/y | JNS rel8 |
3A/: | CMP r,r/m (8-bit) | 5A/Z | POP DX | 7A/z | JP rel8 |
3B/; | CMP r,r/m (16-bit) | 5B/[ | POP BX | 7B/{ | JNP rel8 |
3C/< | CMP AL,imm | 5C/\ | POP SP | 7C/\| | JL rel8 |
3D/= | CMP AX,imm | 5D/] | POP BP | 7D/} | JNL rel8 |
3E/> | DS: | 5E/^ | POP SI | 7E/~ | JLE rel8 |
3F/? | AAS | 5F/_ | POP DI |
As can be seen in the above list, a good selection of opcodes is available in the x86 printable set; this allows three possible methods for producing programs using these operations:
uuencode utility, but is not an integrated solution.

As mentioned above, the concept of Base64 can be applied in this case. Base64 encoding treats the program file as one long number written in base 64: under this base, a number provided as a stream of bytes can be evenly divided into a stream of six-bit digits.
The canonical Base64 encoding is an adaptation of the concept behind hexadecimal: instead of using the standard denary digits in their normal position, the letters of the alphabet are used first. Values 0 to 25 are represented by the uppercase letters A-Z, 26 to 51 by the lowercase letters, and 52 to 61 by the digits 0-9. The remaining encodings 62 and 63 change between variations of the Base64 model, but are used as + and / in most variations.
This canonical encoding skips around the ASCII character set, which makes for complexity in the encoding and decoding algorithm. In the case of this article, the particular characters used for the encoding are unimportant as long as they are within the printable range: the algorithms are simplified if a contiguous range is used, so for simplicity a range from 32-95 will serve as the translation endpoint.
Encoding of the program for transmission can be performed by an external utility; the following C extract will produce a contiguous-base64 encoding of a source file.
    #include <stdio.h>

    /* Read one byte; pad with zero once the end of the file is reached */
    static int next_byte(FILE *in)
    {
        int c = fgetc(in);
        return (c == EOF) ? 0 : c;
    }

    int main(int argc, char **argv)
    {
        FILE *in, *out;
        long fsize, i;
        int n, o[4];

        if(argc != 2) {
            printf("Usage: encode <file>\n");
            return 1;
        }

        in = fopen(argv[1], "rb");
        out = fopen("encode.out", "wb");
        if(!in || !out) {
            perror("encode");
            return 1;
        }

        /* Find out the size of the file */
        fseek(in, 0, SEEK_END);
        fsize = ftell(in);
        fseek(in, 0, SEEK_SET);

        for(i=0; i<fsize; i+=3) {
            /* Retrieve 24 bits from the file, first byte lowest */
            n  = (next_byte(in) << 0);
            n |= (next_byte(in) << 8);
            n |= (next_byte(in) << 16);

            /* Break out into four 6-bit values */
            o[0] = ((n >> 18) & 63) + 32;
            o[1] = ((n >> 12) & 63) + 32;
            o[2] = ((n >>  6) & 63) + 32;
            o[3] = ((n      ) & 63) + 32;

            /* Write encoded values */
            fputc(o[0], out);
            fputc(o[1], out);
            fputc(o[2], out);
            fputc(o[3], out);
        }

        fclose(in);
        fclose(out);
        return 0;
    }
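For experimenting with the encoding away from the command line, the same contiguous scheme can be sketched in Python together with its inverse. The function names `encode` and `decode` are mine, and the zero padding of a short final group is an assumption about how a partial block ought to be handled.

```python
def encode(data):
    """Encode bytes into the contiguous 32..95 range, four characters per
    three bytes, with the first byte of each group in the low bits."""
    out = bytearray()
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3].ljust(3, b"\0")      # zero-pad a short tail
        n = chunk[0] | (chunk[1] << 8) | (chunk[2] << 16)
        for shift in (18, 12, 6, 0):
            out.append(((n >> shift) & 63) + 32)
    return bytes(out)

def decode(text):
    """Reverse of encode(): shift in six bits per character, emit 3 bytes."""
    out = bytearray()
    for i in range(0, len(text), 4):
        n = 0
        for ch in text[i:i + 4]:
            n = (n << 6) | (ch - 32)
        out += n.to_bytes(3, "little")
    return bytes(out)

# mov al, 0x13 / int 0x10 / push 0xA000 (first two bytes)
program = bytes([0xB0, 0x13, 0xCD, 0x10, 0x68, 0x00])
encoded = encode(program)          # b"S1.P &@0"
```

The decoder has to perform the work of `decode` using nothing but printable opcodes, which is where the difficulty lies.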
With the source program encoded into the printable range, the remaining part of the process is the inline decoder. In theory, this is a simple reversal of the above code; the issue is complicated by the fact that the decoder must be written using printable opcodes only. In addition to using printable opcodes, some of the opcodes in the x86 instruction set use an optional ModR/M byte to specify their arguments: this additional byte defines the source and destination for the operation. A few examples:
    XOR AX,BX               ; 33 C3
    MOV AL,BYTE [SI]        ; 8A 04
    AND CX,WORD [SI+0293]   ; 23 8C 93 02
As can be seen above, the ModR/M byte can take on any value, and each of the possible values encodes to a combination of source and destination. Of course, only some of these combinations encode to printable byte values: those selections are shown below.
Source \ Destination (8-bit / 16-bit) | AL / AX | CL / CX | DL / DX | BL / BX | AH / SP | CH / BP | DH / SI | BH / DI |
---|---|---|---|---|---|---|---|---|
[BX+SI] | | | | | 20 | 28 | 30 | 38 |
[BX+DI] | | | | | 21 | 29 | 31 | 39 |
[BP+SI] | | | | | 22 | 2A | 32 | 3A |
[BP+DI] | | | | | 23 | 2B | 33 | 3B |
[SI] | | | | | 24 | 2C | 34 | 3C |
[DI] | | | | | 25 | 2D | 35 | 3D |
[addr] | | | | | 26 | 2E | 36 | 3E |
[BX] | | | | | 27 | 2F | 37 | 3F |
[BX+SI+disp] | 40 | 48 | 50 | 58 | 60 | 68 | 70 | 78 |
[BX+DI+disp] | 41 | 49 | 51 | 59 | 61 | 69 | 71 | 79 |
[BP+SI+disp] | 42 | 4A | 52 | 5A | 62 | 6A | 72 | 7A |
[BP+DI+disp] | 43 | 4B | 53 | 5B | 63 | 6B | 73 | 7B |
[SI+disp] | 44 | 4C | 54 | 5C | 64 | 6C | 74 | 7C |
[DI+disp] | 45 | 4D | 55 | 5D | 65 | 6D | 75 | 7D |
[BP+disp] | 46 | 4E | 56 | 5E | 66 | 6E | 76 | 7E |
[BX+disp] | 47 | 4F | 57 | 5F | 67 | 6F | 77 | |
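The table can be regenerated from the layout of the ModR/M byte itself: two bits of mode, three bits of register and three bits of register/memory selector. The Python sketch below covers 16-bit addressing only; the names `decode_modrm`, `REGS16` and `RM16` are invented for the example, and 16-bit register names are used throughout.

```python
REGS16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
RM16 = ["[BX+SI]", "[BX+DI]", "[BP+SI]", "[BP+DI]",
        "[SI]", "[DI]", "[BP]", "[BX]"]

def decode_modrm(byte):
    """Split a ModR/M byte into (register operand, register/memory operand)."""
    mod, reg, rm = byte >> 6, (byte >> 3) & 7, byte & 7
    if mod == 3:
        operand = REGS16[rm]                    # register-to-register
    elif mod == 0 and rm == 6:
        operand = "[addr]"                      # direct 16-bit address
    else:
        suffix = ("", "+disp8", "+disp16")[mod]
        operand = RM16[rm][:-1] + suffix + "]"
    return REGS16[reg], operand

# Every ModR/M value that falls in the printable range 32..126
printable = {value: decode_modrm(value) for value in range(0x20, 0x7F)}
```

Looking up 0x44 in `printable` gives AX with [SI+disp8], which is the combination used by the `DB 32h, 44h, 41h` trick in the decoder; no register-to-register form appears, since those all have a mode of 3 and so start at 0xC0.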
Due to the limitations imposed by both the opcode range and the allowable addressing of ModR/M bytes, a few techniques have to be employed in code production:
AX is zero. This value can be used throughout the program if it's saved to a register that's used solely for the purpose of providing zeroes. For example:

    ; Use BX for zero register
    PUSH AX
    POP BX

MOV opcode is disallowed from use, and most immediate values cannot be represented as printable bytes, XOR can be used to clear unused bits. For example:

    ; MOV AX, 0013h
    PUSH BX
    POP AX
    XOR AX, 2020h
    XOR AX, 2033h

    ; MOV AL, [SI+1Fh]
    DEC SI
    PUSH BX
    POP AX
    XOR AX, [SI+20h]

XOR where the source byte is 65 bytes from the opcode:

    ; XOR AL, [SI+($-PRSTART)]
    DB 32h, 44h, 41h
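Finding the pair of printable immediates for a given target is mechanical enough to automate. The following Python sketch searches for two printable words that XOR together to the wanted value, assuming the register starts at zero; `xor_pair_byte` and `xor_pair_word` are hypothetical helper names.

```python
PRINTABLE = range(0x20, 0x7F)      # 32 to 126 inclusive

def xor_pair_byte(target):
    """Find two printable bytes a, b with a ^ b == target, or None."""
    for a in PRINTABLE:
        if (a ^ target) in PRINTABLE:
            return a, a ^ target
    return None

def xor_pair_word(target):
    """Find two 16-bit immediates, printable in both bytes, that XOR to target."""
    low = xor_pair_byte(target & 0xFF)
    high = xor_pair_byte(target >> 8)
    if low is None or high is None:
        return None
    return (high[0] << 8) | low[0], (high[1] << 8) | low[1]

first, second = xor_pair_word(0x0013)    # 0x2020 and 0x2033, as used above
```

Any target byte with its top bit set has no such pair, since two values below 0x80 can never XOR to one above it; that is one reason the decoder falls back on SUB to reach values like 0xCB and 0xE9.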
Using these techniques, it's possible to write the reverse decoder for the printable-opcode encoder detailed above, and attach the encoded program to it. The resultant program is shown below, in NASM assembly format.
    mov al, 0x13
    int 0x10            ; Set 320x200x8 graphics mode
    push 0xA000
    pop es              ; Set destination segment

    lineloop:
    mov cx, 0x0140      ; For 200 lines
    in ax, 0x40         ; Get a "random" number from the timer
    and ax, 0xBF
    mul cx
    mov di, ax          ; Write to that line on screen (mod 191)
    in al, 0x40         ; Get a "random" number
    rep stosb           ; Write a line of that colour
    in al, 0x60         ; Check for a key
    dec al
    jnz lineloop        ; Loop back if not ESC

    end:
    ret                 ; Return to DOS
    pusha                       ; Save all registers for post-decoding

    ; Initialise registers and zero-holder
    push ax
    push ax
    push ax
    push ax
    push ax
    pop edx
    pop ebx
    pop cx

    ; Rewrite jpouter and jpinner
    xor al, jpouter-256-96      ; Get a printable-range 8-bit value
    xor ax, 0x235B
    xor ax, 0x225B              ; Add 256
    push ax
    pop si                      ; Use this as the address
    push bx
    pop ax
    xor al, 0x20
    sub al, 0x55                ; AL = 0xCB
    xor [si+0x60], al           ; Opcode = 0xEB (JMP rel)
    xor [si+0x62], al
    xor al,0x34                 ; AL = 0xFF
    xor [si+0x61], al           ; Set jump point to known value
    xor [si+0x63], al
    inc ax                      ; AX = 0x0100
    push ax
    pop bp                      ; BP = 0x0100

    ; Set destination point (CS:1000 - 32)
    xor ax,0x3E30
    xor ax,0x3030
    sub al,32
    push ax
    pop di

    ; Set source point (+256+8)
    push bx
    pop ax
    xor ax,0x3120
    xor ax,0x307E
    push ax
    pop si

    ; Perform the decoding
    lpouter:
    ; In blocks of 4
    push bx
    push bx
    pop eax
    and [di+0x20],eax
    inc cx
    inc cx
    inc cx
    inc cx

    ; Read 6 bits
    lpinner:
    push bx
    pop ax
    db 0x32, 0x44, 0x41         ; xor al, [si+(source-$)]
    sub al,32
    cmp al,94
    je end

    ; Push along by 6 bits, shift in the new bits
    imul edx,[di+0x20],byte 64
    and [di+0x20],ebx
    xor [di+0x20],edx           ; [Dest] = Shifted val
    xor [di+0x20],al            ; [Dest] += next
    inc si
    dec cx
    jnz jpinner

    ; Once the block of 4 is done, DI += 3
    inc di
    inc di
    inc di
    jnz jpouter                 ; Jump indirect

    ; Jump padding, since the above code is 25 bytes long
    db '@@@@@@@'

    end:
    push bp
    pop ax                      ; AX = 0x0100

    ; Rewrite jump to CS:1000
    xor al,jpouter-256-64
    push ax
    pop si
    push bx
    pop ax
    and [si+0x60],ax
    and [si+0x62],ax            ; Clear opcode
    sub al,0x37
    xor al,0x20                 ; AL = 0xE9
    xor [si+0x60],al            ; JMP (rel16)
    push bx
    pop ax
    xor ax,0x3E30
    xor ax,0x3072               ; AX = 0x1000 - $
    xor [si+0x61],ax
    dec di                      ; Clear Z flag
    popa                        ; Retrieve all registers
    jnz jpouter+32              ; Jump indirect to CS:1000

    ; Rewritable jump points
    jpouter:
    db 0x20
    db ($-lpouter)
    jpinner:
    db 0x20
    db ($-lpinner)

    ; The encoded program
    source:
    db 'S1.P &@0N0>@Y0% OR5 X?< Y,>)JO- _F#DZG7(___#'
    db 126                      ; EOF marker
The above encoded program runs in the same manner as the original, and produces the following output:
There are a few issues with this encoding mechanism. Firstly, it is heavily tied to MS-DOS .com programs, since the decoder is itself a DOS program and relies on the initial conditions provided by the MS-DOS program loader. Furthermore, it has a fixed destination for the decoded program of CS:1000h, which only allows programs of a little under 4 kilobytes to be used without modifying the decoder.
These issues, however, can be overlooked if the use of the encoding mechanism is kept to the domain of graphic demos and other simple programs which are to be written using printable opcodes. That domain may, of course, not be a large one.
Imran Nazar <tf@imrannazar.com>, Jun 2011.
In the previous part of this set of articles, I began an introduction to augmented reality, using the simple example of edge detection on Android smartphones; in that part, the camera hardware was introduced, and the framework of an application developed for the use of the camera preview. In this concluding part, the edge detection algorithm itself and its implementation will be explored.
The algorithm that will be used is the Sobel operator, which works as a filter applied to each pixel in an image. The process iterates over each pixel in a row of the image, and over each row in turn, performing a factorised multiplication for each pixel value:
For Y = 1 to (Height-1)
  For X = 1 to (Width-1)
    Horiz_Sobel = (Input[Y-1][X-1] * -1) + (Input[Y-1][X] * 0) + (Input[Y-1][X+1] * 1) +
                  (Input[Y]  [X-1] * -2) + (Input[Y]  [X] * 0) + (Input[Y]  [X+1] * 2) +
                  (Input[Y+1][X-1] * -1) + (Input[Y+1][X] * 0) + (Input[Y+1][X+1] * 1)

    Vert_Sobel  = (Input[Y-1][X-1] * -1) + (Input[Y-1][X] * -2) + (Input[Y-1][X+1] * -1) +
                  (Input[Y]  [X-1] * 0)  + (Input[Y]  [X] * 0)  + (Input[Y]  [X+1] * 0) +
                  (Input[Y+1][X-1] * 1)  + (Input[Y+1][X] * 2)  + (Input[Y+1][X+1] * 1)

    Output[Y][X] = Pythag(Horiz_Sobel, Vert_Sobel)
  Next X
Next Y
The calculation of the Sobel operator index can be simplified in two ways:
With these modifications, the calculation can be adapted to the following.
For Y = 1 to (Height-1)
  For X = 1 to (Width-1)
    Horiz_Sobel = Input[Y+1][X+1] - Input[Y+1][X-1] +
                  Input[Y][X+1] + Input[Y][X+1] - Input[Y][X-1] - Input[Y][X-1] +
                  Input[Y-1][X+1] - Input[Y-1][X-1]

    Vert_Sobel  = Input[Y+1][X+1] + Input[Y+1][X] + Input[Y+1][X] + Input[Y+1][X-1] -
                  Input[Y-1][X+1] - Input[Y-1][X] - Input[Y-1][X] - Input[Y-1][X-1]

    Output[Y][X] = Clamp((Horiz_Sobel + Vert_Sobel) / 2)
  Next X
Next Y
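To make the simplified calculation concrete, here is a minimal runnable sketch of it in JavaScript, operating on a flat array of 8-bit luminance values. The function name `sobel` and the choice of `Uint8ClampedArray` for the clamp to 0..255 are assumptions made for illustration; the loops stop one pixel short of each border so that every neighbour read stays inside the image.

```javascript
// Simplified Sobel filter over a row-major array of luminance values.
// Assumption: Clamp() in the pseudocode means "limit to 0..255", which a
// Uint8ClampedArray does automatically on assignment.
function sobel(input, width, height) {
  const output = new Uint8ClampedArray(width * height);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const pos = y * width + x;
      const horiz =
        input[pos + width + 1] - input[pos + width - 1] +
        input[pos + 1] + input[pos + 1] - input[pos - 1] - input[pos - 1] +
        input[pos - width + 1] - input[pos - width - 1];
      const vert =
        input[pos + width + 1] + input[pos + width] + input[pos + width] +
        input[pos + width - 1] - input[pos - width + 1] - input[pos - width] -
        input[pos - width] - input[pos - width - 1];
      output[pos] = (horiz + vert) / 2;   // clamped to 0..255 on store
    }
  }
  return output;
}
```

Because the two components are averaged rather than combined by magnitude, an edge running from bright to dark gives a negative sum and clamps to zero; only dark-to-bright transitions show up in this form.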
Before this filter can be applied to the camera preview image, the image must be taken from the camera and made ready for processing.
As introduced in Part 1, the camera hardware is capable of automatically calling a predefined function whenever a frame of the preview is ready; this function is referred to as the "preview callback", and receives a byte[] containing the raw image data. By default, the preview image is in NV21 format, a standard luminance/chrominance format; for the example of a 320x240 pixel NV21 image:
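As a rough guide to the buffer layout, the sizes involved can be worked out as follows. This sketch relies on the general definition of NV21 (a full-resolution luminance plane followed by interleaved V and U samples at half resolution in each direction) rather than on anything specific to this application; the helper name `nv21Layout` is hypothetical, and even dimensions are assumed.

```javascript
// Byte layout of an NV21 frame, assuming even width and height.
function nv21Layout(width, height) {
  const lumaBytes = width * height;     // one Y byte per pixel
  const chromaBytes = lumaBytes / 2;    // one V,U byte pair per 2x2 pixel block
  return {
    lumaOffset: 0,
    lumaBytes: lumaBytes,
    chromaOffset: lumaBytes,            // chroma follows the luminance plane
    chromaBytes: chromaBytes,
    totalBytes: lumaBytes + chromaBytes
  };
}
```

For a 320x240 frame this gives 76,800 bytes of luminance at the start of the buffer, followed by 38,400 bytes of chrominance, for 115,200 bytes in total. An edge detector that only needs brightness can therefore read the first width × height bytes and ignore the rest.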
It's relatively straightforward to perform a Sobel calculation on the luminance part of the NV21 image, and a thresholded result can be placed into the overlay canvas for each output pixel:
private int[] mFrameSobel;

public void setPreviewSize(Camera.Size s) {
    // Allocate a 32-bit buffer as large as the preview
    mFrameSobel = new int[s.width * s.height];
    mFrameSize = s;
}

public void setCamera(Camera c) {
    mCam = c;
    mCam.setPreviewCallback(new PreviewCallback() {
        // Called by camera hardware, with preview frame
        public void onPreviewFrame(byte[] frame, Camera c) {
            Canvas cOver = mOverSH.lockCanvas(null);
            try {
                int x, y;
                int w = mFrameSize.width, pos;
                int sobelX, sobelY, sobelFinal;

                for(y=1; y<(mFrameSize.height-1); y++) {
                    pos = y * w + 1;
                    // Advance pos along with x, so each pixel is visited
                    for(x=1; x<(mFrameSize.width-1); x++, pos++) {
                        sobelX = frame[pos+w+1] - frame[pos+w-1]
                               + frame[pos+1] + frame[pos+1]
                               - frame[pos-1] - frame[pos-1]
                               + frame[pos-w+1] - frame[pos-w-1];
                        sobelY = frame[pos+w+1] + frame[pos+w]
                               + frame[pos+w] + frame[pos+w-1]
                               - frame[pos-w+1] - frame[pos-w]
                               - frame[pos-w] - frame[pos-w-1];
                        sobelFinal = (sobelX + sobelY) / 2;

                        // Threshold at 48 (for example)
                        if(sobelFinal < 48) sobelFinal = 0;
                        if(sobelFinal >= 48) sobelFinal = 255;

                        // Build a 32-bit RGBA value, either
                        // transparent black or opaque white
                        mFrameSobel[pos] = (sobelFinal << 0) +
                                           (sobelFinal << 8) +
                                           (sobelFinal << 16) +
                                           (sobelFinal << 24);
                    }
                }

                // Copy calculated frame to bitmap, then
                // translate onto overlay canvas
                Rect src = new Rect(0, 0, mFrameSize.width, mFrameSize.height);
                Rect dst = new Rect(0, 0, cOver.getWidth(), cOver.getHeight());
                Paint pt = new Paint();
                Bitmap bmp = Bitmap.createBitmap(mFrameSobel,
                                                 mFrameSize.width,
                                                 mFrameSize.height,
                                                 Bitmap.Config.ARGB_8888);
                pt.setColor(Color.WHITE);
                pt.setAlpha(0xFF);
                cOver.drawBitmap(bmp, src, dst, pt);
            } catch(Exception e) {
                // Log/trap rendering errors
            } finally {
                mOverSH.unlockCanvasAndPost(cOver);
            }
        }
    });
}
The above code, when run as part of the camera preview, yields the following view.
As written, there's a problem with this application: speed. When run on a hardware device, the overlay calculation is incapable of maintaining a near-real-time speed of augmented display; in the case of my own hardware, a rendering speed of around 3 frames per second was achieved. This is due, in the main, to the calculations being performed within a buffer of managed memory in the Dalvik virtual machine: every access to the camera preview data is checked for boundary conditions, as is every pixel value written to the overlay canvas. All of these checks for boundary conditions take time away from the Sobel operation.
To alleviate this issue, the calculation can be performed in native code bypassing the virtual machine; this is done through the Android Native Development Kit (NDK). The NDK is an implementation of the Java Native Interface (JNI), and as such behaves in a very similar way to standard JNI: native code is placed into functions conforming to a particular naming standard, and they can then be called from the Java VM as specially marked native functions.
NDK native functions are named according to the package and class they're destined for: the standard format is Java_<package>_<class>_<function>. In this particular case, the destination is package sobel and class OverlayView, so the interface can be built as below.
#include <jni.h>

JNIEXPORT void JNICALL Java_sobel_OverlayView_nativeSobel(
    /* Two parameters passed to every JNI function */
    JNIEnv *env, jobject this,
    /* Four parameters specific to this function */
    jbyteArray frame, jint width, jint height, jobject out)
{
    /* Perform Sobel operation, filling "out" */
}
class OverlayView {
    private native void nativeSobel(byte[] frame,
                                    int width,
                                    int height,
                                    IntBuffer out);
}
Note that in the above code, the int[] array used beforehand for overlay output has been replaced by an IntBuffer; this is to allow access to the raw memory buffer for native work, since a standard int[] lives in memory managed by the JVM, and native code can only reach it through JNI accessor calls that may hand back a copy. Direct buffers are designed to allow access to the underlying memory through the JNI's GetDirectBufferAddress function, which we can use for writing the output of the Sobel operation.
The Java code shown above for the operation can be translated directly to C code, as below:
#include <jni.h>

JNIEXPORT void JNICALL Java_sobel_OverlayView_nativeSobel(
    JNIEnv *env, jobject this,
    jbyteArray frame, jint width, jint height, jobject out)
{
    /* Get a pointer to the raw output buffer */
    jint *dest_buf = (jint*) ((*env)->GetDirectBufferAddress(env, out));

    /* Get a pointer to (probably a copy of) the input;
       the elements are bytes, so the pointer must be a jbyte* */
    jboolean frame_copy;
    jbyte *src_buf = (*env)->GetByteArrayElements(env, frame, &frame_copy);

    int x, y, w = width, pos = width+1;
    int maxX = width-1, maxY = height-1;
    int sobelX, sobelY, sobelFinal;

    for(y=1; y<maxY; y++, pos+=2)
    {
        for(x=1; x<maxX; x++, pos++)
        {
            sobelX = src_buf[pos+w+1] - src_buf[pos+w-1]
                   + src_buf[pos+1] + src_buf[pos+1]
                   - src_buf[pos-1] - src_buf[pos-1]
                   + src_buf[pos-w+1] - src_buf[pos-w-1];
            sobelY = src_buf[pos+w+1] + src_buf[pos+w]
                   + src_buf[pos+w] + src_buf[pos+w-1]
                   - src_buf[pos-w+1] - src_buf[pos-w]
                   - src_buf[pos-w] - src_buf[pos-w-1];
            sobelFinal = (sobelX + sobelY) >> 1;

            if(sobelFinal < 48) sobelFinal = 0;
            if(sobelFinal >= 48) sobelFinal = 255;

            dest_buf[pos] = (sobelFinal << 0) |
                            (sobelFinal << 8) |
                            (sobelFinal << 16) |
                            (sobelFinal << 24);
        }
    }

    /* Release the input; nothing was changed, so skip the copy-back */
    (*env)->ReleaseByteArrayElements(env, frame, src_buf, JNI_ABORT);
}
private IntBuffer mFrameSobel;

public void setPreviewSize(Camera.Size s) {
    // Allocate a 32-bit direct buffer as large as the preview
    mFrameSobel = ByteBuffer.allocateDirect(s.width * s.height * 4)
                            .asIntBuffer();
    mFrameSize = s;
}

public void setCamera(Camera c) {
    mCam = c;
    mCam.setPreviewCallback(new PreviewCallback() {
        // Called by camera hardware, with preview frame
        public void onPreviewFrame(byte[] frame, Camera c) {
            Canvas cOver = mOverSH.lockCanvas(null);
            try {
                // Pass the frame's width and height to the native code
                nativeSobel(frame, mFrameSize.width, mFrameSize.height, mFrameSobel);

                // Rewind the buffer after operation
                mFrameSobel.position(0);

                Rect src = new Rect(0, 0, mFrameSize.width, mFrameSize.height);
                Rect dst = new Rect(0, 0, cOver.getWidth(), cOver.getHeight());
                Paint pt = new Paint();

                // createBitmap has no IntBuffer overload: make an empty
                // bitmap, then copy the buffer's pixels into it
                Bitmap bmp = Bitmap.createBitmap(mFrameSize.width,
                                                 mFrameSize.height,
                                                 Bitmap.Config.ARGB_8888);
                bmp.copyPixelsFromBuffer(mFrameSobel);

                pt.setColor(Color.WHITE);
                pt.setAlpha(0xFF);
                cOver.drawBitmap(bmp, src, dst, pt);
            } catch(Exception e) {
                // Log/trap rendering errors
            } finally {
                mOverSH.unlockCanvasAndPost(cOver);
            }
        }
    });
}
Once the Java code has been configured to call the native function for processing, the lack of extraneous work by the JVM results in a significant speed-up: under testing on my hardware, a speed of 15-20 frames per second was easily achievable, and this can be improved through further optimisation of the algorithm.
The Android documentation for the NDK states:
"Using native code does not result in an automatic performance increase, but always increases application complexity."
In the case of the memory-intensive processing presented here, the NDK has a significant advantage over the Java virtual machine, in that it doesn't perform bounds checking on array and pointer accesses. Since most augmented reality applications will need to work on the camera preview image, and provide an overlay on top of the preview, the technique of shunting processing into an NDK function can be useful.
Imran Nazar <tf@imrannazar.com>, May 2011.
One of the most demanding tasks for a smartphone application to take on is "augmented reality": producing a display of the world with information overlaid in real-time. This is generally done by using the smartphone's camera, in preview mode, to provide a base for a translucent overlay; the intensity of the task lies in calculating the contents of the overlay in a time-sensitive environment.
This article hopes to provide a gentle two-part introduction to augmented reality as implemented on Android-based smartphone devices. The process will be introduced using the example of an edge detector run on the camera's current view, and updated alongside the camera view in real-time. Many of the processes involved in producing such a view will apply to any software that seeks to provide a view based on the camera, so the code presented here will have wider application to programs of this class.
The edge detection algorithm that will be used in this article is the Sobel operator; the algorithm will be covered in detail later, but the application developed here will, as a whole, be named after this operator. An example output for the application is shown below.
In order to overlay data on the camera preview screen, it's a prerequisite to be able to display the camera preview; this is done by rendering the preview onto a surface. For that to occur, the simplest method is to place a SurfaceView-type view on the application's main layout, and position it such that it covers the screen. This can be done through the standard layout XML:
<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent">

    <SurfaceView android:id="@+id/surface_camera"
        android:layout_width="fill_parent"
        android:layout_height="fill_parent" />
</FrameLayout>
With a SurfaceView made available, the application's main activity can place a surface and its associated canvas onto the view. To do this, the application needs to act as a SurfaceHolder, and implement the methods of a SurfaceHolder.Callback; this allows the Android operating system to treat the activity as an end-point for rendering surfaces. In code, it's a simple process to define an activity as a surface holder callback: three methods are made available by the SurfaceHolder.Callback interface.
package sobel;

public class Sobel extends Activity implements SurfaceHolder.Callback {
    /* Activity event handlers */

    // Called when activity is initialised by OS
    @Override
    public void onCreate(Bundle inst) {
        super.onCreate(inst);
        setContentView(R.layout.main);

        // Initialise camera
        initCamera();
    }

    // Called when activity is closed by OS
    @Override
    public void onDestroy() {
        // Android requires the superclass handler to be called
        super.onDestroy();

        // Turn off the camera
        stopCamera();
    }

    /* SurfaceHolder event handlers */

    // Called when the surface is first created
    public void surfaceCreated(SurfaceHolder sh) {
        // No action required
    }

    // Called when surface dimensions etc change
    public void surfaceChanged(SurfaceHolder sh,
                               int format,
                               int width,
                               int height) {
        // Start camera preview
        startCamera(sh, width, height);
    }

    // Called when the surface is closed/destroyed
    public void surfaceDestroyed(SurfaceHolder sh) {
        // No action required
    }
}
The above code will deal with the initialisation of the application and its surface, but the camera hardware needs to be initialised and set up for the preview to be available. This is done in three steps:
The camera helper functions mentioned in the above code sample can be filled in to perform these steps:
private Camera mCam;
private SurfaceView mCamSV;
private SurfaceHolder mCamSH;

// Initialise camera and surface
private void initCamera() {
    mCamSV = (SurfaceView)findViewById(R.id.surface_camera);
    mCamSH = mCamSV.getHolder();
    mCamSH.addCallback(this);

    mCam = Camera.open();
}

// Setup camera based on surface parameters
private void startCamera(SurfaceHolder sh, int width, int height) {
    Camera.Parameters p = mCam.getParameters();
    p.setPreviewSize(width, height);
    mCam.setParameters(p);

    try {
        mCam.setPreviewDisplay(sh);
    } catch(Exception e) {
        // Log surface setting exceptions
    }

    mCam.startPreview();
}

// Stop camera when application ends
private void stopCamera() {
    mCamSH.removeCallback(this);

    mCam.stopPreview();
    mCam.release();
}
One consideration to make when setting up the camera is that the size of the surface prepared for preview may not be a size supported by the camera subsystem. If this is the case, and the activity attempts to set a preview size based on the surface size, the application may force-close when it starts. A work-around for this is not to use the surface's dimensions when setting a preview size, but instead to ask the camera which preview sizes are supported, and to use one of those. The list of preview sizes can be retrieved through the camera's Parameters object:
private void startCamera(SurfaceHolder sh, int width, int height) {
    Camera.Parameters p = mCam.getParameters();
    for(Camera.Size s : p.getSupportedPreviewSizes()) {
        // In this instance, simply use the first available
        // preview size; could be refined to find the closest
        // values to the surface size
        p.setPreviewSize(s.width, s.height);
        break;
    }

    mCam.setParameters(p);

    try {
        mCam.setPreviewDisplay(sh);
    } catch(Exception e) {
        // Log surface setting exceptions
    }

    mCam.startPreview();
}
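The refinement mentioned in the code comment, picking the supported size closest to the surface, is independent of the Android API, so it is sketched here in plain JavaScript over objects with `width` and `height` fields. The function name `closestPreviewSize` and the distance measure (sum of absolute differences in each dimension) are assumptions chosen for illustration; other measures, such as matching aspect ratio first, may suit a given device better.

```javascript
// Pick the supported size nearest to the target surface size.
// "Nearest" here means the smallest combined difference in width and height.
function closestPreviewSize(supported, targetWidth, targetHeight) {
  let best = null;
  let bestScore = Infinity;
  for (const s of supported) {
    const score = Math.abs(s.width - targetWidth) +
                  Math.abs(s.height - targetHeight);
    if (score < bestScore) {
      best = s;
      bestScore = score;
    }
  }
  return best;   // null when the list is empty
}
```

The same loop translates directly onto the list of sizes returned by the camera's parameters, with the chosen size then handed to the preview-size setter in place of the first entry.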
The application is now equipped to produce a preview of the camera's current field of view. The preview may appear alongside an application title bar, notification area and so forth; to remove these and gain an unobstructed rendering of the preview, the application can request to be made fullscreen:
@Override
public void onCreate(Bundle inst) {
    super.onCreate(inst);
    getWindow().setFlags(WindowManager.LayoutParams.FLAG_FULLSCREEN,
                         WindowManager.LayoutParams.FLAG_FULLSCREEN);
    setContentView(R.layout.main);
    initCamera();
}
Now that the camera preview is being rendered into a SurfaceView, the next step in augmented reality is the ability to draw pixels and/or shapes over the preview image. Since the camera hardware is directly drawing to the surface made available to it, this surface cannot be used for additional drawing: any output made to the surface will be automatically overwritten by the camera.
This problem can be resolved by providing an additional surface, positioned over the top of the camera preview, onto which things can be drawn by the application. The new surface can also be a SurfaceView, but if the base Android view is utilised in this instance, it cannot be used to draw dynamic content: the SurfaceView must be extended into a new class. For the purposes of this application, the class can be referred to as OverlayView:
package sobel;

public class OverlayView extends SurfaceView {
    private SurfaceHolder mOverSH;

    // Constructors have no return type
    public OverlayView(Context ctx, AttributeSet attr) {
        super(ctx, attr);
        mOverSH = getHolder();
    }
}
private OverlayView mOverSV;

private void initCamera() {
    mCamSV = (SurfaceView)findViewById(R.id.surface_camera);
    mCamSH = mCamSV.getHolder();
    mCamSH.addCallback(this);

    mCam = Camera.open();

    mOverSV = (OverlayView)findViewById(R.id.surface_overlay);
    mOverSV.getHolder().setFormat(PixelFormat.TRANSLUCENT);
    mOverSV.setCamera(mCam);
}

private void startCamera(SurfaceHolder sh, int width, int height) {
    Camera.Parameters p = mCam.getParameters();
    for(Camera.Size s : p.getSupportedPreviewSizes()) {
        p.setPreviewSize(s.width, s.height);
        mOverSV.setPreviewSize(s);
        break;
    }

    // ...
}
In order to lay this new view class over the camera's preview surface, the layout XML needs to be modified to load in the overlay view beforehand:
<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent">

    <sobel.OverlayView android:id="@+id/surface_overlay"
        android:layout_width="fill_parent"
        android:layout_height="fill_parent" />

    <SurfaceView android:id="@+id/surface_camera"
        android:layout_width="fill_parent"
        android:layout_height="fill_parent" />
</FrameLayout>
With an overlay in place, the content on the overlay needs to be drawn, and regularly updated. Drawing onto a surface is a familiar concept from computer graphics, requiring the locking of a canvas and the drawing of primitives to the canvas; keeping the canvas regularly updated against the camera preview is a little less familiar. A regular update can be achieved in one of two ways:
byte[] of the contents of the camera preview, which can easily be used for calculation of an overlay.

To set up a callback to a method in the OverlayView, the view must first know about the camera: a handle to the camera must be passed over from the main activity. In addition, it's useful for the OverlayView to know the size of preview image it's working with, since the callback method doesn't provide dimensions. The calls to these methods can be seen in the above code sample from Sobel.java, made at initialisation time; the methods are outlined below.
private Camera mCam;
private Camera.Size mFrameSize;

// Held by the view, so both methods below can reach it
private int mFrameCount;

// Called by Sobel.surfaceChanged, to set dimensions
public void setPreviewSize(Camera.Size s) {
    mFrameSize = s;
    mFrameCount = 0;
}

// Called by Sobel.initCamera, to set callback
public void setCamera(Camera c) {
    mCam = c;
    mCam.setPreviewCallback(new PreviewCallback() {
        // Called by camera hardware, with preview frame
        public void onPreviewFrame(byte[] frame, Camera c) {
            Canvas cOver = mOverSH.lockCanvas(null);
            try {
                // Perform overlay rendering here
                // Here, draw an incrementing number onscreen
                Paint pt = new Paint();
                pt.setColor(Color.WHITE);
                pt.setTextSize(16);
                cOver.drawText(Integer.toString(mFrameCount++),
                               10, 10, pt);
            } catch(Exception e) {
                // Log/trap rendering errors
            } finally {
                mOverSH.unlockCanvasAndPost(cOver);
            }
        }
    });
}
Running the above code on hardware results in something akin to the following image:
The above code takes the application to a point where it can retrieve data from the camera preview (through the preview frame callback's byte[] parameter), and render an overlay. In the second part of this article, I'll look at how the preview data can be run through the Sobel edge detection filter, and how the result can be displayed on the overlay.
Imran Nazar <tf@imrannazar.com>, Apr 2011.
This is part 10 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
Since the first computers were put together, one of their basic functions has been to keep time: to coordinate actions according to timers. Even the simplest of games has an element of time to it: Pong, for example, needs to move the ball across the screen at a particular rate. In order to handle these timing issues, every games console has some form of timer to allow for things to happen at a given moment, or at a specific rate.
The GameBoy is no exception to this rule, and contains a set of registers which automatically increment based on a programmable schedule. In this part of the series, I'll be investigating the structure and operation of the timer, and how it can be used to seed pseudo-random number generators, such as the one contained in Tetris and its various clones. One example of a Tetris clone which uses the timer, to pick random pieces for the game, is demonstrated below.
The GameBoy's CPU, as described in the first part of this series, runs on a 4,194,304Hz clock, with two internal measures of the time taken to execute each instruction: the T-clock, which increments with each clock step, and the M-clock, which increments at a quarter of the speed (1,048,576Hz). These clocks are used as the source of the timer, which counts up, in turn, at a quarter of the rate of the M-clock: 262,144Hz. In this article, I'll refer to this final value as the timer's "base speed".
The GameBoy's timer hardware offers two separate timer registers: the system works by incrementing the value in each of these registers at a pre-determined rate. The "divider" timer is permanently set to increment at 16384Hz, one sixteenth of the base speed; since it's only an eight-bit register, its value will go back to zero after it reaches 255. The "counter" timer is more programmable: it can be set to one of four speeds (the base divided by 1, 4, 16 or 64), and it can be set to go back to a value that isn't zero when it overflows past 255. In addition, the timer hardware will send an interrupt to the CPU, as described in part 8, whenever the "counter" timer does overflow.
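The chain of divisions described above can be written out as a short calculation. The constant names below are chosen for illustration and do not appear in the emulator; `interruptsPerSecond` is a hypothetical helper that assumes the counter restarts from the modulo value after each overflow.

```javascript
// Clock rates derived from the CPU clock, as described in the text
const CPU_HZ = 4194304;                 // T-clock
const M_CLOCK_HZ = CPU_HZ / 4;          // 1,048,576Hz
const BASE_HZ = M_CLOCK_HZ / 4;         // 262,144Hz: the timer's base speed
const DIVIDER_HZ = BASE_HZ / 16;        // 16,384Hz, fixed
const COUNTER_HZ = [1, 4, 16, 64].map(function(d) { return BASE_HZ / d; });

// How often the counter overflows, when it restarts from "modulo"
function interruptsPerSecond(counterHz, modulo) {
  return counterHz / (256 - modulo);
}
```

The four counter speeds work out to 262,144Hz, 65,536Hz, 16,384Hz and 4,096Hz. At the slowest speed with a modulo of zero, the counter overflows 16 times per second.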
There are four registers used by the timer; these are made available for use by the system as part of the I/O page, just like the graphics and interrupt registers:
Address | Register | Details
---|---|---
0xFF04 | Divider | Counts up at a fixed 16384Hz; reset to 0 whenever written to
0xFF05 | Counter | Counts up at the specified rate; triggers INT 0x50 when going 255->0
0xFF06 | Modulo | When Counter overflows to 0, it's reset to start at Modulo
0xFF07 | Control | Bits 0-1: speed (00: 4096Hz, 01: 262144Hz, 10: 65536Hz, 11: 16384Hz); bit 2: 1 to run the timer, 0 to stop it
Since the "counter" timer triggers an interrupt when it overflows, it can be especially useful if a game requires something to happen at a regular interval. However, a GameBoy game can generally use the vertical blank to much the same effect, since it occurs at a regular pace of almost 60Hz; the vertical blanking handler can be used not only to refresh the screen contents, but to check the keypad and update the game state. Therefore, there's little call for use of the timer in traditional GameBoy games, though it can be used to greater effect in graphic demos.
The emulation developed in this article series uses the CPU's clock as the basic unit of time. For that reason, it's simplest to maintain a clock for the timer that runs in step with the CPU clock, and is updated by the dispatch function. It's convenient at this stage to keep the DIV register as a separate entity to the controllable timer, incremented at 1/16th the rate again of the fastest timer step:
TIMER = {
    _clock: {
        main: 0,
        sub: 0,
        div: 0
    },

    _reg: {
        div: 0,
        tima: 0,
        tma: 0,
        tac: 0
    },

    inc: function() {
        // Increment by the last opcode's time
        TIMER._clock.sub += Z80._r.m;

        // No opcode takes longer than 4 M-times,
        // so we need only check for overflow once
        if(TIMER._clock.sub >= 4) {
            TIMER._clock.main++;
            TIMER._clock.sub -= 4;

            // The DIV register increments at 1/16th
            // the rate, so keep a count of this
            TIMER._clock.div++;
            if(TIMER._clock.div == 16) {
                TIMER._reg.div = (TIMER._reg.div+1) & 255;
                TIMER._clock.div = 0;
            }
        }

        // Check whether a step needs to be made in the timer
        TIMER.check();
    }
};
while(true) {
    // Run execute for this instruction
    var op = MMU.rc(Z80._r.pc++);
    Z80._map[op]();
    Z80._r.pc &= 65535;
    Z80._clock.m += Z80._r.m;
    Z80._clock.t += Z80._r.t;

    // Update the timer
    TIMER.inc();

    Z80._r.m = 0;
    Z80._r.t = 0;

    // If IME is on, and some interrupts are enabled in IE, and
    // an interrupt flag is set, handle the interrupt
    if(Z80._r.ime && MMU._ie && MMU._if) {
        // Mask off ints that aren't enabled
        var ifired = MMU._ie & MMU._if;

        if(ifired & 0x01) {
            MMU._if &= (255 - 0x01);
            Z80._ops.RST40();
        }
    }

    Z80._clock.m += Z80._r.m;
    Z80._clock.t += Z80._r.t;

    // Update timer again, in case a RST occurred
    TIMER.inc();
}
From here, the controllable timer is made up of varying divisions of the base speed, making it relatively simple to check whether the timer values need to be stepped up, and to provide the registers as part of the memory I/O page. The interface between the following section of code and the MMU I/O page handler, is left as an exercise for the reader.
    check: function() {
        if(TIMER._reg.tac & 4) {
            // Declared locally, to avoid creating a global
            var threshold;
            switch(TIMER._reg.tac & 3) {
                case 0: threshold = 64; break;    // 4K
                case 1: threshold = 1;  break;    // 256K
                case 2: threshold = 4;  break;    // 64K
                case 3: threshold = 16; break;    // 16K
            }

            if(TIMER._clock.main >= threshold) TIMER.step();
        }
    },

    step: function() {
        // Step the timer up by one
        TIMER._clock.main = 0;
        TIMER._reg.tima++;

        if(TIMER._reg.tima > 255) {
            // At overflow, refill with the Modulo
            TIMER._reg.tima = TIMER._reg.tma;

            // Flag a timer interrupt to the dispatcher
            MMU._if |= 4;
        }
    },

    rb: function(addr) {
        switch(addr) {
            case 0xFF04: return TIMER._reg.div;
            case 0xFF05: return TIMER._reg.tima;
            case 0xFF06: return TIMER._reg.tma;
            case 0xFF07: return TIMER._reg.tac;
        }
    },

    wb: function(addr, val) {
        switch(addr) {
            case 0xFF04: TIMER._reg.div = 0; break;
            case 0xFF05: TIMER._reg.tima = val; break;
            case 0xFF06: TIMER._reg.tma = val; break;
            case 0xFF07: TIMER._reg.tac = val & 7; break;
        }
    }
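The interplay of the control register, the step threshold and the modulo reload can be tried out in isolation with a small standalone model. This is a simplified sketch, not the emulator's code: `makeTimer` is a hypothetical helper, one call to `tick` stands for one step of the base clock (four M-times), and interrupts are merely counted rather than flagged to an MMU.

```javascript
// Standalone model of the "counter" timer, ticked at the base speed.
function makeTimer(tac, tma) {
  var thresholds = [64, 1, 4, 16];      // indexed by the low two bits of TAC
  return {
    tac: tac & 7,
    tma: tma & 255,
    tima: tma & 255,                    // start the counter at the modulo
    main: 0,
    interrupts: 0,
    tick: function() {
      if (!(this.tac & 4)) return;      // bit 2 clear: timer is stopped
      this.main++;
      if (this.main >= thresholds[this.tac & 3]) {
        this.main = 0;
        this.tima++;
        if (this.tima > 255) {
          this.tima = this.tma;         // refill with the modulo
          this.interrupts++;            // stands in for MMU._if |= 4
        }
      }
    }
  };
}

// Convenience: run a timer for a number of base-clock ticks
function runTimer(timer, ticks) {
  for (var i = 0; i < ticks; i++) timer.tick();
  return timer;
}
```

With the control register at 5 (running, fastest speed) and a modulo of 0xF0, the counter overflows every 16 ticks, so 64 ticks produce four interrupts; at control value 4 (running, slowest speed) the same 64 ticks advance the counter by a single step.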
A major component of many games is unpredictability: Tetris, for instance, will throw an unknown pattern of pieces down the well, and the game consists of building rows using these pieces. Ideally, a computer provides unpredictability by generating random numbers, but this runs contrary to the methodical nature of a computer; it's not possible for a computer to provide a truly random pattern of numbers. Various algorithms exist to produce sequences of numbers that look superficially like they're random, and these are called pseudo-random number generation (PRNG) algorithms.
A PRNG is generally implemented as a formula that, given a particular input number, will produce another number with almost no relation to the input. For Tetris, nothing so complicated is required; instead, the following code is used to produce a seemingly random block.
BLK_NEXT = 0xC203
BLK_CURR = 0xC213
REG_DIV  = 0x04

NBLOCK:
    ld hl, BLK_CURR     ; Bring the next block
    ld a, (BLK_NEXT)    ; forward to current
    ld (hl),a
    and 0xFC            ; Clear out any rotations
    ld c,a              ; and hold onto previous

    ld h,3              ; Try the following 3 times

.seed:
    ldh a, (REG_DIV)    ; Get a "random" seed
    ld b,a
.loop:
    xor a               ; Step down in sevens
.seven:
    dec b               ; until zero is reached
    jr z, .next         ; This loop is equivalent
    inc a               ; to (a%7)*4
    inc a
    inc a
    inc a
    cp 28
    jr z, .loop
    jr .seven

.next:
    ld e,a              ; Copy the new value
    dec h               ; If this is the
    jr z, .end          ; last try, just use this
    or c                ; Otherwise check
    and 0xFC            ; against the previous block
    cp c                ; If it's the same again,
    jr z, .seed         ; try another random number

.end:
    ld a,e              ; Get the copy back
    ld (BLK_NEXT), a    ; This is our next block
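The counting loop in the middle of the routine can be mirrored step by step in JavaScript, which makes it easy to compare against a closed form. Both function names below are hypothetical. The closed form is my reading of the loop rather than something stated in the original: because `dec b` runs before the first test, the result depends on the seed minus one, wrapped to eight bits.

```javascript
// Mirror of the .loop/.seven section: b is the DIV seed, a the result
function tetrisLoop(div) {
  var b = div & 255;
  var a = 0;                      // xor a
  for (;;) {
    b = (b - 1) & 255;            // dec b (eight-bit wrap)
    if (b === 0) return a;        // jr z, .next
    a += 4;                       // four inc a
    if (a === 28) a = 0;          // cp 28 / jr z, .loop
  }
}

// Closed form of the same calculation
function tetrisClosedForm(div) {
  return (((div - 1) & 255) % 7) * 4;
}
```

The result is always one of seven values (0, 4, 8, up to 24), one per piece shape, with the low two bits left clear for the rotation.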
The basis of the Tetris block selector is the DIV register: since the selection routine is only run once every few seconds, the register will have an unknown value on any given run, and it thus makes a fair approximation of a random number source. With the timer system having been emulated, Tetris and its clones can be emulated to full functionality, as shown in Figure 1.
One aspect of game emulation which has been overlooked until now is the generation of sound, and the synchronisation of sound to the speed of the emulation. Over and above the aspect of sound generation by the emulator, is the method by which sound is output to the browser; the next part of this series will investigate the issues surrounding sound output mechanisms, and whether a coherent strategy can be put together for sound production in JavaScript.
Imran Nazar <tf@imrannazar.com>, Feb 2011.
The black hole changed that. It was spotted a few years ago, slowly drifting across the plane of the solar system; it was only seen at all because it crossed the path of Venus, and the planet changed shape for a few hours. In their haste to focus attacks on Earth, the invaders missed the shift in the appearance of Venus, and we got there first.
With so much detritus up in high orbit, and so many people based on the Atlantic tower platform, it didn't take long for an old warship module to be staffed out with a skeleton crew, and pushed off in the direction of the black hole. The crew ended up being us, the second-shift engineering staff of the Nagios. The warship was a decrepit 301 with no gun batteries to speak of, but it did have one thing in its favour: the experimental coil we were dispatched to try out.
I took on the role of pilot, simply because I'd done some strong gravity training back at the Academy. Jackson was our chief engineer, and Fuller had the day off, so he got roped in too. Three of us, and a crazy theory, to try to change the war.
The theory was simple, when compared to others in the field of quantum relativity. A black hole has a gravitational field, which radiates out in lines, much like a magnetic field. Black holes also spin, which means the lines of gravitational field are moving all the time. Vassilev theorised that a superconducting coil placed around the hole, near the Schwarzschild radius, would cut through the field lines and generate an electrogravitic field in the coil. It was impossible to even attempt a confirmation of the theory at the time, since there were no black holes, and no superconductive material to spare.
With the war, the situation was different. Even our old and broken 301 had a few miles of ceramic-wire running to the weapons systems, which could carry incredible amounts of charge. Measurements of the black hole mass put it at around one thousandth of the sun's mass, which left the hole's Schwarzschild radius at 30 metres: there would be plenty of wire for the coil. Getting the coil into place, that was my job as ship's pilot.
As we came up towards Venus, I called down from the pilot's chair, into the pit. "Jackson, get the ceramic down the back; let's give this hole a good run."
It's not easy to steer a course around any strong gravity source, as we'd all found out when the Weichinger carrier came too close to an enemy neutron barrage: it had been torn clean in half by the tidal forces. A black hole was more predictable, but much scarier at the same time. At least with a neutron barrage, it was possible to come out the other side.
I'd done this kind of thing in simulators before, with a whole term of spaceflight at the Academy focussed on getting around when strong gravity was in the area. One thing you never expect is that the outside universe gets smaller. We were close enough to the hole now, that I could barely see the Earth as a small point in what seemed like the far distance, and stars were beginning to show trails as we moved.
The 301 wasn't a highly manoeuvrable warship at the height of its career, over thirty years ago: it was often referred to as the Bucket Class by the current cadets of the Academy. Moving around near a black hole was another adventure, with the constant tug to the left as I wound the ship anti-clockwise around the point.
"For what it's worth," Fuller called up from the cargo hold, "we're nearly out of gas for the manoeuvring thrusters."
"We're low on everything," I said in reply. "No worries, I'm nearly done here."
The hole was at least uniform, so it was pretty short work to get the coil deployed in four loops around its equator. Jackson came up the ladder to the viewscreen, while Fuller stayed down in the hold to get the coil's ends into a power meter. Based on the shape of the black hole, and how stationary I could keep the ship against it, we were looking at maybe three hundred gigawatts: not bad for a proof of concept.
Fuller fired up the meter, and there was silence for a few seconds. "Er. Jackson, Irvine? You'll want to see this."
"I'm holding the ship straight," I answered. "Do I need to come down there to see what the meter says?"
"Well, it's off the scale, and the scale goes up to a terawatt."
Jackson looked down into the pit with a slightly incredulous look. "Nah, that can't be right. Lemme get down there, I'll check the wire."
Even superconducting ceramic heats up when enough charge flows through it. Jackson put his hand against the coil, and drew it back quickly. "That's, er, warm. You might want to get someone else out here, Irvine; we might have more than we bargained for here."
That's how it started, of course. Eventually, we had wires criss-crossing that hole and pulling out around ten terawatts: more than enough to feed a new generation of powered weapons. Thanks to the hole, we fought the invaders off, and Earth managed to claw its way out into the light.
A few years after the war ended, Stan Vassilev was awarded the Peace Prize, which he found suitably ironic. It was his opinion that we'd bled our new power source dry, and it would soon evaporate; a month later, the hole vanished, and the 10TW of power went with it.
I still don't know where it came from, and why it showed up to help Earth when it did. Religion's outlawed by the military, so I can't be a religious man, but I do wonder sometimes. Even as an engineer, it strikes me as odd that Earth was dealt such a massive stroke of luck: someone out there's looking after our interests.
This is part 9 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
Thus far in this series, we've been dealing with the loading and emulation of a simple memory map for the GameBoy, with the entirety of the game ROM fitting into the lower half of memory. There aren't many games that fit into memory in full (Tetris is one of the few); most games are larger than this, and have to employ an independent mechanism to swap "banks" of game ROM into the GameBoy CPU's view.
Some of the first games in the GameBoy library were built with a Memory Bank Controller inside the cartridge, which did this job of swapping banks of ROM into view; over the years, various versions of the cartridge MBC were built for increasingly large games. In the particular example of the demo associated with this part, the first version of the MBC is used to handle the loading of a 64kB ROM.
Through the years, many computer systems have had to deal with the problem of having too much program to fit into memory. Traditionally, there have been two ways to deal with this problem.
Since the GameBoy is a fixed hardware platform with wide distribution, there's no way to increase the address space when larger games are produced; instead, the Memory Bank Controller built into the cartridge offers a way to switch 16kB banks of ROM into view. In addition to this, the MBC1 supports up to 32kB of "external RAM", which is writable memory in the cartridge; this can be banked into the [A000-BFFF] space in the memory map, if it's available.
In order to facilitate software that uses the MBC1, the first 16kB bank of ROM (bank 0) is fixed at address 0000; the second half of the ROM space can be made into a window on any ROM bank between 1 and 127, for a maximum ROM size of 2048kB. One of the oddities of the MBC1 is that it deals internally in 32's: banks #32, #64 and #96 are inaccessible, since they're treated within the banking system as bank #0. This means that 124 switchable banks are usable in addition to the fixed bank #0, for 125 banks in total.
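The bank arithmetic can be checked by enumerating every value the two ROM bank registers can hold. The following is a minimal sketch, assuming that a zero in the low five bits is always treated as 1; `mbc1SelectableBanks` is a hypothetical helper for illustration, and is not part of the emulator.

```javascript
// Enumerate every combination of the high 2 bits and low 5 bits of the
// ROM bank number, substituting 1 wherever the low 5 bits are 0.
function mbc1SelectableBanks() {
    var seen = {};
    var banks = [];
    for(var high = 0; high < 4; high++) {
        for(var low = 0; low < 32; low++) {
            var effectiveLow = (low === 0) ? 1 : low;
            var bank = (high << 5) | effectiveLow;
            if(!seen[bank]) {
                seen[bank] = true;
                banks.push(bank);
            }
        }
    }
    return banks;
}

var selectable = mbc1SelectableBanks();
console.log(selectable.length);               // 124
console.log(selectable.indexOf(32) === -1);   // true: bank #32 is never selected
```

Counting the fixed bank #0 alongside these 124 switchable banks gives 125 banks in total.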
There are four registers within the MBC1 chip, that allow for switching of banks for the ROM and RAM; these can be changed by writing to the (normally read-only) ROM space anywhere within a certain range. The details are given in the below table.
Locations | Register | Details |
---|---|---|
0000-1FFF | Enable external RAM | 4 bits wide; value of 0x0A enables RAM, any other value disables |
2000-3FFF | ROM bank (low 5 bits) | Switch between banks 1-31 (value 0 is seen as 1) |
4000-5FFF | ROM bank (high 2 bits) / RAM bank | ROM mode: switch ROM bank "set" {1-31}-{97-127}; RAM mode: switch RAM bank 0-3 |
6000-7FFF | Mode | 0: ROM mode (no RAM banks, up to 2MB ROM); 1: RAM mode (4 RAM banks, up to 512kB ROM) |
Since there are multiple kinds of controller for banking, any given game must state which MBC is used, in the cartridge header data. This is the first chunk of data in the cartridge ROM, and follows a specific format.
Location(s) | Value | Size (bytes) | Details |
---|---|---|---|
0100-0103h | Entry point | 4 | Where the game starts; usually "NOP; JP 0150h" |
0104-0133h | Nintendo logo | 48 | Used by the BIOS to verify checksum |
0134-0143h | Title | 16 | Uppercase, padded with 0 |
0144-0145h | Publisher | 2 | Used by newer GameBoy games |
0146h | Super GameBoy flag | 1 | Value of 3 indicates SGB support |
0147h | Cartridge type | 1 | MBC type/extras |
0148h | ROM size | 1 | Usually between 0 and 7; Size = 32kB << [0148h] |
0149h | RAM size | 1 | Size of external RAM |
014Ah | Destination | 1 | 0 for Japan market, 1 otherwise |
014Bh | Publisher | 1 | Used by older GameBoy games |
014Ch | ROM version | 1 | Version of the game, usually 0 |
014Dh | Header checksum | 1 | Checked by BIOS before loading |
014E-014Fh | Global checksum | 2 | Simple summation, not checked |
0150h | Start of game | | |
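As an illustration of the header layout, the following sketch pulls a few of the fields out of a ROM image. It assumes the ROM is available as an array of byte values (the emulator itself holds the ROM as a string); `parseCartHeader` and its field names are hypothetical, chosen for this example only.

```javascript
// Read selected header fields from a ROM held as an array of bytes.
function parseCartHeader(rom) {
    // Title: up to 16 characters at 0134-0143h, padded with 0
    var title = "";
    for(var i = 0x0134; i <= 0x0143; i++) {
        if(rom[i] === 0) break;
        title += String.fromCharCode(rom[i]);
    }

    return {
        title: title,
        sgb: rom[0x0146] === 3,          // Super GameBoy support
        cartType: rom[0x0147],           // MBC type/extras
        romSizeKB: 32 << rom[0x0148],    // Size = 32kB << [0148h]
        ramSizeCode: rom[0x0149],        // Raw RAM size value
        version: rom[0x014C]             // ROM version
    };
}

// Build a blank 32kB image with a made-up header for demonstration
var demoRom = new Uint8Array(0x8000);
var demoTitle = "DEMO";
for(var c = 0; c < demoTitle.length; c++) {
    demoRom[0x0134 + c] = demoTitle.charCodeAt(c);
}
demoRom[0x0147] = 0x01;   // MBC1
demoRom[0x0148] = 1;      // 64kB

var demoHeader = parseCartHeader(demoRom);
console.log(demoHeader.title, demoHeader.cartType, demoHeader.romSizeKB);
```

With the demonstration values above, the cartridge is reported as an MBC1 cartridge holding 64kB of ROM.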
In this particular case, we're interested in the value of 0147h, the cartridge type. The cartridge type can be one of the following values, if an MBC1 is fitted to the cartridge:
Value | Definition |
---|---|
00h | No MBC |
01h | MBC1 |
02h | MBC1 with external RAM |
03h | MBC1 with battery-backed external RAM |
For the purposes of this article, a system of battery backing will not be implemented for the external RAM; this feature is often used by games to save their state for later use, and will be looked at in more detail in a later part.
The memory bank controllers are an obvious manipulation of memory, and thus fit neatly into the MMU. Since the first ROM bank (bank #0) is fixed, an offset need only be maintained for the MBC to indicate where it's reading for the second bank. In order to allow for more MBC handling to be added later, an array of data can be used to hold the state of a given controller:
```javascript
MMU = {
    // MBC states
    _mbc: [],

    // Offset for second ROM bank
    _romoffs: 0x4000,

    // Offset for RAM bank
    _ramoffs: 0x0000,

    // Copy of the ROM's cartridge-type value
    _carttype: 0,

    reset: function() {
        ...
        // In addition to previous reset code,
        // initialise MBC internal data
        MMU._mbc[0] = {};
        MMU._mbc[1] = {
            rombank: 0,    // Selected ROM bank
            rambank: 0,    // Selected RAM bank
            ramon: 0,      // RAM enable switch
            mode: 0        // ROM/RAM expansion mode
        };

        MMU._romoffs = 0x4000;
        MMU._ramoffs = 0x0000;
    },

    load: function(file) {
        ...
        MMU._carttype = MMU._rom.charCodeAt(0x0147);
    }
}
```
As can be seen in the above code, the internal state of the MBC1's four registers is represented by an object within the MMU, associated with MBC type 1. When these are changed, the ROM and RAM offsets can be modified to point into the appropriate bank of memory; once the pointers are set, access to the memory can proceed almost as normal.
```javascript
MMU = {
    rb: function(addr) {
        switch(addr & 0xF000) {
            ...
            // ROM (switched bank)
            case 0x4000: case 0x5000: case 0x6000: case 0x7000:
                return MMU._rom.charCodeAt(MMU._romoffs + (addr & 0x3FFF));

            // External RAM
            case 0xA000: case 0xB000:
                return MMU._eram[MMU._ramoffs + (addr & 0x1FFF)];
        }
    }
};
```
The calculation of these pointer offsets is performed when the MBC registers are written, as shown below.
```javascript
wb: function(addr, val) {
    switch(addr & 0xF000) {
        // MBC1: External RAM switch
        case 0x0000: case 0x1000:
            switch(MMU._carttype) {
                case 2: case 3:
                    MMU._mbc[1].ramon = ((val & 0x0F) == 0x0A) ? 1 : 0;
                    break;
            }
            break;

        // MBC1: ROM bank
        case 0x2000: case 0x3000:
            switch(MMU._carttype) {
                case 1: case 2: case 3:
                    // Set lower 5 bits of ROM bank (skipping #0)
                    val &= 0x1F;
                    if(!val) val = 1;
                    MMU._mbc[1].rombank = (MMU._mbc[1].rombank & 0x60) + val;

                    // Calculate ROM offset from bank
                    MMU._romoffs = MMU._mbc[1].rombank * 0x4000;
                    break;
            }
            break;

        // MBC1: RAM bank
        case 0x4000: case 0x5000:
            switch(MMU._carttype) {
                case 1: case 2: case 3:
                    if(MMU._mbc[1].mode) {
                        // RAM mode: Set bank
                        MMU._mbc[1].rambank = val & 3;
                        MMU._ramoffs = MMU._mbc[1].rambank * 0x2000;
                    } else {
                        // ROM mode: Set high bits of bank
                        MMU._mbc[1].rombank = (MMU._mbc[1].rombank & 0x1F) + ((val & 3) << 5);
                        MMU._romoffs = MMU._mbc[1].rombank * 0x4000;
                    }
                    break;
            }
            break;

        // MBC1: Mode switch
        case 0x6000: case 0x7000:
            switch(MMU._carttype) {
                case 2: case 3:
                    MMU._mbc[1].mode = val & 1;
                    break;
            }
            break;

        ...

        // External RAM
        case 0xA000: case 0xB000:
            MMU._eram[MMU._ramoffs + (addr & 0x1FFF)] = val;
            break;
    }
}
```
In the above control code, instances of MBC1 that are stated as having external RAM attached are the ones which have RAM banking. With this code in place, the demo shown in Figure 1 loads and runs properly; without the MBC1 handler, the code would crash while attempting to access sprite and background data for the display.
Aside from being able to fit larger games into memory, one of the more important aspects of a game is the ability to keep time: a clock-based game, for example, is useless without some kind of timing mechanism on which to base its clock. As mentioned previously, many games use the vertical blanking interrupt for this timing, but some require a finer-grained time structure; this is provided in the GameBoy by a hardware timer, tied into the CPU clock.
The timer also provides a method of examining the CPU clock, which makes it useful as a seed for random number generators; Tetris, for example, picks its blocks using this functionality of the hardware timer. In the next part, I'll look at the details of how the timer works, and how it can be implemented.
Imran Nazar <tf@imrannazar.com>, Dec 2010.
The ship had always been furtive while hopping through space: making random jumps like a fly being preyed upon. The engines were growing torpid with old age, jumping less often and not as far, with progress deteriorating by the day. The only way to fix things was to take them offline for refurbishment, which left the ship stationary and open to attack.
The go-ahead had been given, budgeted to two hours: it generally took the Alliance ships at least three to find them at rest stops, so two hours was plenty. Mera shut down the reactor; as its spin slowed, he opened the engine casing and saw the problem immediately.
Some of the waste water had crept into the engines, evaporated with the heat, and left salt deposits encrusted around the antimatter injectors. This would take more than two hours to fix; the ship would have to land on a safe planet for a full overhaul of the injector matrix.
Mera closed up the casing, and fired up the reactor. Nothing happened: the reactor had itself seized with salt. The catch was that the Jump engine needed power from the reactor: without the Jump drive, they'd never get to a planet before Alliance hordes were crawling over them.
Now they were screwed.
This is part 8 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
Please note: This article has been updated to remove an incorrect interrupt handling procedure. --12th Nov, 2010
In the previous part, the foundations for simulating a game were laid, with the introduction of sprites. However, one aspect was missing from the emulator: the vertical blanking interrupt. In this part, interrupts as a whole will be introduced, and the blanking interrupt in particular will be implemented; once this has been done, the emulator will run Tetris.
Imagine that you have a computer with a network card, and some software that processes data from the network. From the perspective of the computer, data only comes in every so often, so you need some way for the software to know that new data has arrived. There are two ways for this to happen:
It's obvious that the concept of interrupts is a useful one, but there are both hardware and software requirements for interrupts to work. In hardware terms, the CPU has to temporarily stop execution of what it's doing when an interrupt arrives, and instead begin execution of an interrupt handler (sometimes referred to as an Interrupt Service Routine). In the above scenario, a wire is run between the network card and the CPU, allowing the card to inform the CPU when data has arrived.
The CPU will check its interrupt inputs at the end of every instruction. If an interrupt signal has been given by some attached peripheral like the network card, steps are taken by the CPU to start the interrupt handler: the CPU will save the location where it left off normal execution, register the fact that the interrupt happened, and jump across to the handler.
In the GameBoy, there are five different interrupt wires, feeding in from the various peripherals. Each one has its own ISR at a different address in memory; the list of interrupts is as follows.
Interrupt | ISR address (hex) |
---|---|
Vertical blank | 0040 |
LCD status triggers | 0048 |
Timer overflow | 0050 |
Serial link | 0058 |
Joypad press | 0060 |
In the case of the vertical blank, a wire is threaded into the bottom of the LCD; as soon as the GPU has finished scanning all the LCD lines and runs into the bottom of the screen, the interrupt fires and the CPU jumps to 0040, executing the blanking ISR.
Most CPUs contain a "master flag" for interrupts: they will only be handled by the CPU if this flag is enabled. The Z80 in the GameBoy is no exception, but there are additional registers that deal with the individual interrupts available in the GameBoy. These are memory registers, so they are handled by the memory management unit:
Register | Location | Notes | Details |
---|---|---|---|
Interrupt enable | FFFF | When bits are set, the corresponding interrupt can be triggered | Bit 0: vertical blank; bit 1: LCD status; bit 2: timer overflow; bit 3: serial link; bit 4: joypad press |
Interrupt flags | FF0F | When bits are set, an interrupt has happened | Bits in the same order as FFFF |
Since these are memory registers, their implementation is something for the MMU:
```javascript
MMU = {
    _ie: 0,
    _if: 0,

    rb: function(addr) {
        switch(addr & 0xF000) {
            ...
            case 0xF000:
                switch(addr & 0x0F00) {
                    ...
                    // Zero-page
                    case 0xF00:
                        if(addr == 0xFFFF) {
                            return MMU._ie;
                        } else if(addr >= 0xFF80) {
                            return MMU._zram[addr & 0x7F];
                        } else {
                            // I/O control handling
                            switch(addr & 0x00F0) {
                                case 0x00:
                                    if(addr == 0xFF0F) return MMU._if;
                                    break;
                                ...
                            }
                            return 0;
                        }
                }
        }
    },
    ...
};
```
The Z80's "master enable" switch is, in a similar manner, something for the Z80 implementation. The CPU provides opcodes for software to flick the master enable into either On or Off position, so these will also need to be implemented:
```javascript
Z80 = {
    _r: {
        ime: 0,
        ...
    },

    reset: function() {
        ...
        Z80._r.ime = 1;
    },

    // Disable IME
    DI: function() { Z80._r.ime = 0; Z80._r.m = 1; Z80._r.t = 4; },

    // Enable IME
    EI: function() { Z80._r.ime = 1; Z80._r.m = 1; Z80._r.t = 4; }
};
```
With the interrupt flags in place, the main execution loop can be redeveloped, to fall more in line with the execution path from figure 3. After execution, the interrupt flags need checking to see whether an enabled interrupt has occurred; if it has, its handler can be called.
```javascript
Z80 = {
    _ops: {
        ...
        // Start vblank handler (0040h)
        RST40: function() {
            // Disable further interrupts
            Z80._r.ime = 0;

            // Save current PC on the stack
            Z80._r.sp -= 2;
            MMU.ww(Z80._r.sp, Z80._r.pc);

            // Jump to handler
            Z80._r.pc = 0x0040;
            Z80._r.m = 3;
            Z80._r.t = 12;
        },

        // Return from interrupt (called by handler)
        RETI: function() {
            // Restore interrupts
            Z80._r.ime = 1;

            // Jump to the address on the stack
            Z80._r.pc = MMU.rw(Z80._r.sp);
            Z80._r.sp += 2;

            Z80._r.m = 3;
            Z80._r.t = 12;
        }
    }
};

while(true) {
    // Run execute for this instruction
    var op = MMU.rb(Z80._r.pc++);
    Z80._map[op]();
    Z80._r.pc &= 65535;
    Z80._clock.m += Z80._r.m;
    Z80._clock.t += Z80._r.t;
    Z80._r.m = 0;
    Z80._r.t = 0;

    // If IME is on, and some interrupts are enabled in IE, and
    // an interrupt flag is set, handle the interrupt
    if(Z80._r.ime && MMU._ie && MMU._if) {
        // Mask off ints that aren't enabled
        var ifired = MMU._ie & MMU._if;

        if(ifired & 0x01) {
            MMU._if &= (255 - 0x01);
            Z80._ops.RST40();
        }
    }

    Z80._clock.m += Z80._r.m;
    Z80._clock.t += Z80._r.t;
}
```
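The loop above only services the vertical blank. As a sketch of how the same check might extend to all five interrupts, the function below picks the pending interrupt to service from the master enable, the enable register and the flag register. It assumes the bits follow the order of the ISR address table, with the vertical blank in bit 0, and that lower bits are serviced first; `nextInterrupt` and `ISR_VECTORS` are hypothetical names for this example.

```javascript
// ISR addresses, indexed by bit number in the enable/flag registers
var ISR_VECTORS = [0x0040, 0x0048, 0x0050, 0x0058, 0x0060];

// Returns null if nothing should fire, otherwise the bit that fired,
// the handler address, and the flag register with that bit cleared.
function nextInterrupt(ime, ie, iflags) {
    if(!ime) return null;

    // Mask off interrupts that aren't enabled
    var fired = ie & iflags & 0x1F;

    for(var bit = 0; bit < 5; bit++) {
        if(fired & (1 << bit)) {
            return {
                bit: bit,
                vector: ISR_VECTORS[bit],
                flagsAfter: iflags & ~(1 << bit) & 0xFF
            };
        }
    }
    return null;
}

// Vertical blank and timer both flagged, all interrupts enabled
var pending = nextInterrupt(1, 0x1F, 0x05);
console.log(pending.vector.toString(16));   // "40"
```

The caller would then store `flagsAfter` back into the flag register and jump to `vector`, in the same way that `RST40` does for the vertical blank.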
As shown in Figure 1, the emulator has reached a reasonable stage: it's able to emulate a released game in at least some form. It does, however, have the problem of game size. Tetris is a 32kB ROM, and fits perfectly into the "ROM" space in the memory map. Games tend to have larger ROMs than this, and the cartridge follows a process of mapping portions of the ROM into memory. Next time, I'll look at the simplest available form of ROM mapping for the GameBoy, and its implementation on a 64kB game ROM.
Imran Nazar <tf@imrannazar.com>, Nov 2010.
This is part 7 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
Previously in this series, the emulator was extended to enable keypad input, which meant that a game of tic-tac-toe could be played. The problem left by this was that the game had to be played blind: there was no indication of where the next move would be made, nor of where on the board a keypress would move you. Traditionally, two-dimensional gaming consoles have solved this issue through the use of sprites: movable object blocks that can be placed independently of the background, and which contain data separate to that of the background.
The GameBoy is no exception in this regard: it provides for sprites to be placed above or below the background, and multiple sprites to be on screen at the same time. Once this has been implemented in the emulator, the tic-tac-toe game runs as below.
GameBoy sprites are graphic tiles, just like those used for the background: this means that each sprite is 8x8 pixels. As stated above, a sprite can be placed anywhere on the screen, including halfway or all the way off-screen, and it can be placed above or below the background. What this means technically is that sprites below the background show through where the background has colour value 0.
In the above figure, the sprite above the background shows the background through the middle of it, since these pixels in the sprite are set to colour 0; in the same way, the background lets through the sprites below it where the background colour is 0. In order to simulate this in an emulator, the simplest procedure would be to render the sprites below the background, then the background itself, and finally the sprites above it. However, this is a somewhat naive algorithm, since it duplicates the sprite rendering process; it's simpler instead to draw the background first, then work out whether a given pixel in the sprite should appear based on its priority and the background colour at that position.
```
For each row in sprite
    If this row is on screen
        For each pixel in row
            If this pixel is on screen
                If this pixel is transparent
                    * Do nothing
                Else
                    If the sprite has priority
                        Draw pixel
                    Else if this pixel in the background is 0
                        Draw pixel
                    Else
                        * Do nothing
                    End If
                End If
            End If
        End For
    End If
End For
```
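The innermost decision of this procedure can be written as a small function. This is only a sketch: `spritePixelVisible` is a hypothetical name, and the colour values are the 0-3 tile colours before any palette mapping.

```javascript
// Decide whether one sprite pixel should be drawn over the background.
function spritePixelVisible(spriteColour, spriteHasPriority, bgColour) {
    // Colour 0 in a sprite is transparent
    if(spriteColour === 0) return false;

    // A sprite with priority always draws over the background
    if(spriteHasPriority) return true;

    // Otherwise, the sprite only shows through background colour 0
    return bgColour === 0;
}

console.log(spritePixelVisible(2, false, 0));   // true
console.log(spritePixelVisible(2, false, 3));   // false
```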
One additional complication to the GameBoy sprite system is that a sprite can be "flipped" horizontally or vertically by the hardware, at the time it's rendered; this saves space in the game, since (for example) a spaceship flying backwards can be represented by the same sprite as forward motion, with the appropriate flip applied.
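Flipping amounts to reading the tile's rows or columns in reverse order. The sketch below assumes a tile is held as an 8x8 array of colour values, indexed as `tile[y][x]`; `flipTile` is a hypothetical helper, and a renderer would more likely apply the flip while reading each row than build a new tile.

```javascript
// Return a copy of an 8x8 tile with the requested flips applied.
function flipTile(tile, xflip, yflip) {
    var out = [];
    for(var y = 0; y < 8; y++) {
        // A vertical flip reads the rows from the bottom up
        var src = tile[yflip ? (7 - y) : y];
        var row = [];
        for(var x = 0; x < 8; x++) {
            // A horizontal flip reads the pixels from right to left
            row.push(src[xflip ? (7 - x) : x]);
        }
        out.push(row);
    }
    return out;
}
```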
The GameBoy can hold information about 40 sprites, in a dedicated region of memory called Object Attribute Memory (OAM). Each of the 40 sprites has four bytes of data in the OAM associated with it, as detailed below.
Byte | Description |
---|---|
0 | Y-coordinate of top-left corner (Value stored is Y-coordinate plus 16) |
1 | X-coordinate of top-left corner (Value stored is X-coordinate plus 8) |
2 | Data tile number |
3 | Options: bit 7, sprite/background priority; bit 6, Y-flip; bit 5, X-flip; bit 4, palette (object palette 0 or 1) |
In order to more easily access this information when it comes to rendering a scanline, it's useful to build a structure to hold the sprite data, which is filled in based on the contents of the OAM. When data is written to the OAM, the MMU in concert with the graphics emulation can update this structure for later use. An implementation of this would be as follows.
```javascript
rb: function(addr) {
    switch(addr & 0xF000) {
        ...
        case 0xF000:
            switch(addr & 0x0F00) {
                ...
                // OAM
                case 0xE00:
                    return (addr < 0xFEA0) ? GPU._oam[addr & 0xFF] : 0;
            }
    }
},

wb: function(addr, val) {
    switch(addr & 0xF000) {
        ...
        case 0xF000:
            switch(addr & 0x0F00) {
                ...
                // OAM
                case 0xE00:
                    if(addr < 0xFEA0) GPU._oam[addr & 0xFF] = val;
                    GPU.buildobjdata(addr - 0xFE00, val);
                    break;
            }
    }
}
```
```javascript
_oam: [],
_objdata: [],

reset: function() {
    // In addition to previous reset code:
    for(var i=0, n=0; i < 40; i++, n+=4) {
        GPU._oam[n + 0] = 0;
        GPU._oam[n + 1] = 0;
        GPU._oam[n + 2] = 0;
        GPU._oam[n + 3] = 0;
        GPU._objdata[i] = {
            'y': -16, 'x': -8,
            'tile': 0, 'palette': 0,
            'xflip': 0, 'yflip': 0, 'prio': 0, 'num': i
        };
    }
},

buildobjdata: function(addr, val) {
    var obj = addr >> 2;
    if(obj < 40) {
        switch(addr & 3) {
            // Y-coordinate
            case 0: GPU._objdata[obj].y = val-16; break;

            // X-coordinate
            case 1: GPU._objdata[obj].x = val-8; break;

            // Data tile
            case 2: GPU._objdata[obj].tile = val; break;

            // Options
            case 3:
                GPU._objdata[obj].palette = (val & 0x10) ? 1 : 0;
                GPU._objdata[obj].xflip   = (val & 0x20) ? 1 : 0;
                GPU._objdata[obj].yflip   = (val & 0x40) ? 1 : 0;
                GPU._objdata[obj].prio    = (val & 0x80) ? 1 : 0;
                break;
        }
    }
}
```
As hinted above, the GPU offers a choice of two palettes for the sprites: each of the 40 sprites can use one of the two palettes, as specified in its OAM entry. These object palettes are stored in the GPU, in addition to the background palette, and can be changed through I/O registers in much the same manner as the palette for the background.
```javascript
_pal: {
    bg: [],
    obj0: [],
    obj1: []
},

wb: function(addr, val) {
    switch(addr) {
        // ...
        // Background palette
        case 0xFF47:
            for(var i = 0; i < 4; i++) {
                switch((val >> (i * 2)) & 3) {
                    case 0: GPU._pal.bg[i] = [255,255,255,255]; break;
                    case 1: GPU._pal.bg[i] = [192,192,192,255]; break;
                    case 2: GPU._pal.bg[i] = [ 96, 96, 96,255]; break;
                    case 3: GPU._pal.bg[i] = [  0,  0,  0,255]; break;
                }
            }
            break;

        // Object palettes
        case 0xFF48:
            for(var i = 0; i < 4; i++) {
                switch((val >> (i * 2)) & 3) {
                    case 0: GPU._pal.obj0[i] = [255,255,255,255]; break;
                    case 1: GPU._pal.obj0[i] = [192,192,192,255]; break;
                    case 2: GPU._pal.obj0[i] = [ 96, 96, 96,255]; break;
                    case 3: GPU._pal.obj0[i] = [  0,  0,  0,255]; break;
                }
            }
            break;

        case 0xFF49:
            for(var i = 0; i < 4; i++) {
                switch((val >> (i * 2)) & 3) {
                    case 0: GPU._pal.obj1[i] = [255,255,255,255]; break;
                    case 1: GPU._pal.obj1[i] = [192,192,192,255]; break;
                    case 2: GPU._pal.obj1[i] = [ 96, 96, 96,255]; break;
                    case 3: GPU._pal.obj1[i] = [  0,  0,  0,255]; break;
                }
            }
            break;
    }
}
```
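Since the three palette registers are decoded identically, the repeated switch could be folded into one helper. The following is a sketch of that alternative rather than the code used by the emulator; `decodePalette` and `SHADES` are hypothetical names.

```javascript
// The four shades, as RGBA values, indexed by 2-bit colour code
var SHADES = [
    [255, 255, 255, 255],
    [192, 192, 192, 255],
    [ 96,  96,  96, 255],
    [  0,   0,   0, 255]
];

// Decode a palette register value into four RGBA entries:
// bits 0-1 give the shade for colour 0, bits 2-3 for colour 1, and so on.
function decodePalette(val) {
    var pal = [];
    for(var i = 0; i < 4; i++) {
        // Copy the shade so that each palette owns its own arrays
        pal[i] = SHADES[(val >> (i * 2)) & 3].slice();
    }
    return pal;
}

// 0xE4 (binary 11 10 01 00) maps each colour to the shade of the same number
console.log(decodePalette(0xE4)[3]);   // [ 0, 0, 0, 255 ]
```

Each case in the register handler would then reduce to a single assignment, such as `GPU._pal.obj0 = decodePalette(val);`.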
The GameBoy graphics system renders each line of the screen as it's encountered: this includes not only the background, but the sprites below and above it. In other words, rendering of the sprites must be added to the scanline renderer, as a process that occurs after drawing the background. Just as with the background, there's a switch to enable sprites within the LCDC register, and this must be added to the I/O handling for the GPU.
Since a sprite can be anywhere on the screen, including positioned somewhere off-screen, the renderer has to check which sprites are positioned within the current scanline. The simplest algorithm for this is to check the position of each one, and render the appropriate line of the sprite if it falls within the bounds of the scanline. The sprite data can be retrieved in the same way as it is for the background, through the pre-calculated tile set. An example of these things brought together is as follows.
```javascript
renderscan: function() {
    // Scanline data, for use by sprite renderer
    var scanrow = [];

    // Render background if it's switched on
    if(GPU._switchbg) {
        var mapoffs = GPU._bgmap ? 0x1C00 : 0x1800;

        // Each row of the 32x32 tile map is 32 bytes long
        mapoffs += (((GPU._line + GPU._scy) & 255) >> 3) << 5;
        var lineoffs = (GPU._scx >> 3);
        var y = (GPU._line + GPU._scy) & 7;
        var x = GPU._scx & 7;
        var canvasoffs = GPU._line * 160 * 4;
        var colour;
        var tile = GPU._vram[mapoffs + lineoffs];

        // If the tile data set in use is #1, the
        // indices are signed; calculate a real tile offset
        if(GPU._bgtile == 1 && tile < 128) tile += 256;

        for(var i = 0; i < 160; i++) {
            // Re-map the tile pixel through the palette
            colour = GPU._pal.bg[GPU._tileset[tile][y][x]];

            // Plot the pixel to canvas
            GPU._scrn.data[canvasoffs+0] = colour[0];
            GPU._scrn.data[canvasoffs+1] = colour[1];
            GPU._scrn.data[canvasoffs+2] = colour[2];
            GPU._scrn.data[canvasoffs+3] = colour[3];
            canvasoffs += 4;

            // Store the pixel for later checking
            scanrow[i] = GPU._tileset[tile][y][x];

            // When this tile ends, read another
            x++;
            if(x == 8) {
                x = 0;
                lineoffs = (lineoffs + 1) & 31;
                tile = GPU._vram[mapoffs + lineoffs];
                if(GPU._bgtile == 1 && tile < 128) tile += 256;
            }
        }
    }

    // Render sprites if they're switched on
    if(GPU._switchobj) {
        for(var i = 0; i < 40; i++) {
            var obj = GPU._objdata[i];

            // Check if this sprite falls on this scanline
            if(obj.y <= GPU._line && (obj.y + 8) > GPU._line) {
                // Palette to use for this sprite
                var pal = obj.palette ? GPU._pal.obj1 : GPU._pal.obj0;

                // Where to render on the canvas
                var canvasoffs = (GPU._line * 160 + obj.x) * 4;

                // Data for this line of the sprite
                var tilerow;

                // If the sprite is Y-flipped,
                // use the opposite side of the tile
                if(obj.yflip) {
                    tilerow = GPU._tileset[obj.tile][7 - (GPU._line - obj.y)];
                } else {
                    tilerow = GPU._tileset[obj.tile][GPU._line - obj.y];
                }

                var colour;
                var pixel;

                for(var x = 0; x < 8; x++) {
                    // If the sprite is X-flipped,
                    // read pixels in reverse order
                    pixel = tilerow[obj.xflip ? (7-x) : x];

                    // If this pixel is still on-screen, AND
                    // if it's not colour 0 (transparent), AND
                    // if this sprite has priority OR shows under the bg
                    // then render the pixel
                    if((obj.x + x) >= 0 && (obj.x + x) < 160 &&
                       pixel &&
                       (obj.prio || !scanrow[obj.x + x])) {
                        colour = pal[pixel];

                        GPU._scrn.data[canvasoffs+0] = colour[0];
                        GPU._scrn.data[canvasoffs+1] = colour[1];
                        GPU._scrn.data[canvasoffs+2] = colour[2];
                        GPU._scrn.data[canvasoffs+3] = colour[3];
                    }

                    // Move to the next canvas pixel, whether or
                    // not this one was drawn
                    canvasoffs += 4;
                }
            }
        }
    }
},

rb: function(addr) {
    switch(addr) {
        // LCD Control
        case 0xFF40:
            return (GPU._switchbg  ? 0x01 : 0x00) |
                   (GPU._switchobj ? 0x02 : 0x00) |
                   (GPU._bgmap     ? 0x08 : 0x00) |
                   (GPU._bgtile    ? 0x10 : 0x00) |
                   (GPU._switchlcd ? 0x80 : 0x00);

        // ...
    }
},

wb: function(addr, val) {
    switch(addr) {
        // LCD Control
        case 0xFF40:
            GPU._switchbg  = (val & 0x01) ? 1 : 0;
            GPU._switchobj = (val & 0x02) ? 1 : 0;
            GPU._bgmap     = (val & 0x08) ? 1 : 0;
            GPU._bgtile    = (val & 0x10) ? 1 : 0;
            GPU._switchlcd = (val & 0x80) ? 1 : 0;
            break;

        // ...
    }
}
```
With sprites in place, basic games like the tic-tac-toe running in Figure 1 can work in full. Many games, however, will not run without something else: a method of determining when the screen can be redrawn. Almost every game will perform a "refresh" of the screen data while the screen is in vertical blanking, since changes to the screen won't show up until the next time the GPU comes to draw a frame.
Basic games and demos sometimes do this by checking whether the GPU has hit line #144 in its redrawing process, but this takes up a lot of processing power in repeated looping. The more common method is for the game to be informed when an event has occurred: this message is referred to as an interrupt. In the next part, I'll take a look at the vertical blanking interrupt in particular, and how it can be simulated to provide this message passing process to an emulated game.
Imran Nazar <tf@imrannazar.com>, Oct 2010.
This is part 6 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
With a working emulator and interface developed over the previous five parts, the emulation system is able to run a basic test ROM, and produce graphical output. What the emulator is currently unable to do is take keypresses as keypad input, and feed them through to the ROM under test; in order for this to be done, the keypad's influence on the I/O registers must be emulated.
With the addition of keypad I/O, the emulator runs as follows.
The GameBoy has a single method of input, an eight-key pad out of which any number of keys can be depressed. With most keyboards, the keys are laid out in a grid of columns and rows: these can be treated as wires, between which a key can form a connection. When one of the columns is activated, any rows connected to that column will also activate, and the hardware is able to detect the active rows to determine the currently pressed keys.
With the GameBoy, the keyboard grid has two columns and four rows, which has the advantage that all the required connections can be made within one 8-bit I/O register.
Since all six lines are tied to the same register, the GameBoy procedure for reading a key is slightly convoluted:
Writing code to simulate keypad presses is relatively simple, but two factors complicate the issue: allowing for a column to be set in the grid before rows are read, and the keypress codes that are used by JavaScript. In order to accommodate the two columns, two values must be used by the emulation, each of which holds the intersections between that column and the rows. One additional factor to take into account is that the values are reversed for the keypad: a row is left at high voltage by default, and is dropped to zero voltage when it intersects a column. This is interpreted by the I/O register as the row bits being 1 for no key pressed, and 0 for a keypress.
The JavaScript keydown and keyup events can be used to find out when a key has been pressed or released; tying these into the keypad handler can be done in the following manner.
```javascript
KEY = {
    _rows: [0x0F, 0x0F],
    _column: 0,

    reset: function() {
        KEY._rows = [0x0F, 0x0F];
        KEY._column = 0;
    },

    rb: function(addr) {
        switch(KEY._column) {
            case 0x10: return KEY._rows[0];
            case 0x20: return KEY._rows[1];
            default: return 0;
        }
    },

    wb: function(addr, val) {
        KEY._column = val & 0x30;
    },

    kdown: function(e) {
        // Reset the appropriate bit
    },

    kup: function(e) {
        // Set the appropriate bit
    }
};

window.onkeydown = KEY.kdown;
window.onkeyup = KEY.kup;
```
In addition to this, the MMU must be extended to handle the keypad I/O register, with an addition to the zero-page handling routines; an example of this is given below.
```javascript
rb: function(addr) {
    switch(addr & 0xF000) {
        ...
        case 0xF000:
            switch(addr & 0x0F00) {
                ...
                // Zero-page
                case 0xF00:
                    if(addr >= 0xFF80) {
                        return MMU._zram[addr & 0x7F];
                    } else if(addr >= 0xFF40) {
                        // GPU (64 registers)
                        return GPU.rb(addr);
                    } else switch(addr & 0x3F) {
                        case 0x00: return KEY.rb();
                        default: return 0;
                    }
            }
    }
}
```
With the keypad handler plumbed in, the remaining issue is the handling of keypresses, and the ability of the keypad code to distinguish between different keys being pressed. This can be done through the JavaScript event object; any event that runs through the browser, such as a mouse click or a keypress, will be passed to the code if it's requested, along with an object that describes the event that's just occurred. In the case of a keypress, the event object contains a character code and a "key scan" code, which both describe the key in question.
Through testing by Peter-Paul Koch, it has been determined that the character code passed by browsers to JavaScript code is unreliable, and will change depending on which browser is used. The only case on which all browsers agree is the key-scan code produced for keyup and keydown events; in any browser, pressing a given key will yield a particular value.
For the purposes of this emulator, eight keys need to be handled by the keypad code:
Scan code | Key | Mapping |
---|---|---|
13 | Enter | Start |
32 | Space | Select |
37 | Left arrow | Left |
38 | Up arrow | Up |
39 | Right arrow | Right |
40 | Down arrow | Down |
88 | X | B |
90 | Z | A |
As stated above, the appropriate bits must be reset when a key is pressed, and set when the key is released. This can be implemented as follows.
kdown: function(e) {
    // A pressed key clears its bit in KEY._rows
    switch(e.keyCode) {
        case 39: KEY._rows[1] &= 0xE; break;
        case 37: KEY._rows[1] &= 0xD; break;
        case 38: KEY._rows[1] &= 0xB; break;
        case 40: KEY._rows[1] &= 0x7; break;
        case 90: KEY._rows[0] &= 0xE; break;
        case 88: KEY._rows[0] &= 0xD; break;
        case 32: KEY._rows[0] &= 0xB; break;
        case 13: KEY._rows[0] &= 0x7; break;
    }
},

kup: function(e) {
    // A released key sets its bit again
    switch(e.keyCode) {
        case 39: KEY._rows[1] |= 0x1; break;
        case 37: KEY._rows[1] |= 0x2; break;
        case 38: KEY._rows[1] |= 0x4; break;
        case 40: KEY._rows[1] |= 0x8; break;
        case 90: KEY._rows[0] |= 0x1; break;
        case 88: KEY._rows[0] |= 0x2; break;
        case 32: KEY._rows[0] |= 0x4; break;
        case 13: KEY._rows[0] |= 0x8; break;
    }
}
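The same mapping can also be written as a lookup table, which keeps the scan codes from the table above in one place. What follows is a minimal sketch rather than the emulator's own code: KEYMAP, pressKey and releaseKey are hypothetical names, and the rows array passed in stands in for the two keypad rows held by the KEY object.

```javascript
// Hypothetical lookup table: scan code -> [row index, bit mask].
// Row 0 holds A/B/Select/Start, row 1 holds the direction keys,
// using the same bit assignments as the switch statements above.
var KEYMAP = {
    39: [1, 0x1], // Right arrow
    37: [1, 0x2], // Left arrow
    38: [1, 0x4], // Up arrow
    40: [1, 0x8], // Down arrow
    90: [0, 0x1], // Z -> A
    88: [0, 0x2], // X -> B
    32: [0, 0x4], // Space -> Select
    13: [0, 0x8]  // Enter -> Start
};

// A pressed key reads as 0, so pressing clears the bit
function pressKey(rows, keyCode) {
    var k = KEYMAP[keyCode];
    if(k) rows[k[0]] &= (~k[1]) & 0x0F;
}

// Releasing the key sets the bit again
function releaseKey(rows, keyCode) {
    var k = KEYMAP[keyCode];
    if(k) rows[k[0]] |= k[1];
}
```

Keys that are not in the table are ignored, which matches the behaviour of a switch statement with no default case.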
Figure 1 above shows the result of these additions to the emulator, when running a basic tic-tac-toe game. In this example, the initial screen can be advanced to the credits by pressing the Start key, which is mapped to Enter by this emulator. Another press of the Start key will bring up the game screen, and the game can be played with the player as one side, and the computer as the other; pressing the GameBoy's A key (mapped to Z) will place a cross or circle on behalf of the player.
Right now, the game must be played blind, since there is no indicator of where the player places a mark. The game produces this indicator by using a sprite: a tile which can be placed by the graphics chip above the background, and moved independently. Most games produce their gameplay through use of sprites, so building them into the simulation is an important next step for this series. Next time, I'll be taking a look at the facilities provided by the GameBoy for the rendering of sprites, and how they can be implemented in JavaScript.
Imran Nazar <tf@imrannazar.com>, Sep 2010.
This is part 5 of an article series on emulation development in JavaScript; ten parts are currently available, and others are expected to follow.
In part 4, the GameBoy's graphics subsystem was explored in detail, and an emulation put together. Without a set of register mappings for the GPU to be dealt with in software, the graphics subsystem cannot be used by the emulator; once these registers have been made available, the emulator is essentially ready for basic use.
With the additions detailed below to add the GPU registers, and a basic interface for the control of the emulator, the result is as follows.
The graphics unit of the GameBoy has a series of registers which are mapped into memory, in the I/O space of the memory map. In order to get a working emulation with a background image, the following registers will be needed by the GPU (other registers are also available to the GPU, and will be explored in later parts of this series).
Address | Register | Status |
---|---|---|
0xFF40 | LCD and GPU control | Read/write |
0xFF42 | Scroll-Y | Read/write |
0xFF43 | Scroll-X | Read/write |
0xFF44 | Current scan line | Read only |
0xFF47 | Background palette | Write only |
The background palette register has previously been explored, and consists of four 2-bit palette entries. The scroll registers and scanline counter are full-byte values; this leaves the LCD control register, which is made up of 8 separate flags controlling the sections of the GPU.
Bit | Function | When 0 | When 1 |
---|---|---|---|
0 | Background: on/off | Off | On |
1 | Sprites: on/off | Off | On |
2 | Sprites: size (pixels) | 8x8 | 8x16 |
3 | Background: tile map | #0 | #1 |
4 | Background: tile set | #0 | #1 |
5 | Window: on/off | Off | On |
6 | Window: tile map | #0 | #1 |
7 | Display: on/off | Off | On |
In the above table, the additional features of the GPU appear: a "window" layer which can appear above the background, and sprite objects which can be moved against the background and window. These additional features will be covered as the need for them arises; in the meantime, the background flags are most important for basic rendering functions. In particular, it can be seen here how the background tile map and tile set can be changed, simply by flipping bits in the register 0xFF40.
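To make the flag layout concrete, the control byte can be split into named fields. This is a sketch under the bit assignments in the table above; decodeLcdc and its field names are hypothetical, and 0x91 is just an example value.

```javascript
// Hypothetical helper: split an LCD control byte into named flags,
// following the bit assignments in the table above
function decodeLcdc(val) {
    return {
        bgOn:        (val & 0x01) !== 0,
        spritesOn:   (val & 0x02) !== 0,
        spritesTall: (val & 0x04) !== 0, // 8x16 sprites when set
        bgMap:       (val & 0x08) ? 1 : 0,
        bgTileSet:   (val & 0x10) ? 1 : 0,
        windowOn:    (val & 0x20) !== 0,
        windowMap:   (val & 0x40) ? 1 : 0,
        displayOn:   (val & 0x80) !== 0
    };
}

// 0x91 = 1001 0001: display on, background tile set #1, background on
var flags = decodeLcdc(0x91);
```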
Armed with the conceptual GPU register layout, an emulation can be implemented simply by adding handlers for these addresses to the MMU. This can either be done by hard-coding the GPU updates into the MMU, or defining a range of registers wherein the GPU will be called from the MMU, for more specialised handling to be done from there. In the interests of modularity, the latter approach has been taken here.
rb: function(addr) {
    switch(addr & 0xF000) {
        ...
        case 0xF000:
            switch(addr & 0x0F00) {
                ...
                // Zero-page
                case 0xF00:
                    if(addr >= 0xFF80) {
                        return MMU._zram[addr & 0x7F];
                    } else {
                        // I/O control handling
                        switch(addr & 0x00F0) {
                            // GPU (64 registers)
                            case 0x40: case 0x50: case 0x60: case 0x70:
                                return GPU.rb(addr);
                        }
                        return 0;
                    }
            }
    }
},

wb: function(addr, val) {
    switch(addr & 0xF000) {
        ...
        case 0xF000:
            switch(addr & 0x0F00) {
                ...
                // Zero-page
                case 0xF00:
                    if(addr >= 0xFF80) {
                        MMU._zram[addr & 0x7F] = val;
                    } else {
                        // I/O
                        switch(addr & 0x00F0) {
                            // GPU
                            case 0x40: case 0x50: case 0x60: case 0x70:
                                GPU.wb(addr, val);
                                break;
                        }
                    }
                    break;
            }
            break;
    }
}
rb: function(addr) {
    switch(addr) {
        // LCD Control
        case 0xFF40:
            return (GPU._switchbg  ? 0x01 : 0x00) |
                   (GPU._bgmap     ? 0x08 : 0x00) |
                   (GPU._bgtile    ? 0x10 : 0x00) |
                   (GPU._switchlcd ? 0x80 : 0x00);

        // Scroll Y
        case 0xFF42:
            return GPU._scy;

        // Scroll X
        case 0xFF43:
            return GPU._scx;

        // Current scanline
        case 0xFF44:
            return GPU._line;
    }
},

wb: function(addr, val) {
    switch(addr) {
        // LCD Control
        case 0xFF40:
            GPU._switchbg  = (val & 0x01) ? 1 : 0;
            GPU._bgmap     = (val & 0x08) ? 1 : 0;
            GPU._bgtile    = (val & 0x10) ? 1 : 0;
            GPU._switchlcd = (val & 0x80) ? 1 : 0;
            break;

        // Scroll Y
        case 0xFF42:
            GPU._scy = val;
            break;

        // Scroll X
        case 0xFF43:
            GPU._scx = val;
            break;

        // Background palette
        case 0xFF47:
            for(var i = 0; i < 4; i++) {
                switch((val >> (i * 2)) & 3) {
                    case 0: GPU._pal[i] = [255,255,255,255]; break;
                    case 1: GPU._pal[i] = [192,192,192,255]; break;
                    case 2: GPU._pal[i] = [ 96, 96, 96,255]; break;
                    case 3: GPU._pal[i] = [  0,  0,  0,255]; break;
                }
            }
            break;
    }
}
At present, the dispatch loop for the emulator's CPU runs forever, without pause. The most basic interface for an emulator allows for the simulation to be reset or paused; in order to allow for this, a known amount of time must be used as the base unit of the emulator interface. There are three possible units of time that can be used for this:
Since a frame is made of 144 scanlines and a 10-line vertical blank, and each scanline takes 456 clock cycles to run, the length of a frame is 70224 clocks. In conjunction with an emulator-level reset function, which initialises each subsystem at the start of the emulation, the emulator itself can be run, and a rudimentary interface provided.
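The frame length can be checked with a few lines of arithmetic. This sketch uses the per-mode timings and the 4194304 Hz clock quoted in part 3 of the series; the constant names are hypothetical.

```javascript
// Clocks spent in each part of a scanline: OAM access, VRAM access, hblank
var CLOCKS_PER_LINE = 80 + 172 + 204;                      // 456
// Visible scanlines plus the lines spent in vertical blank
var LINES_PER_FRAME = 144 + 10;                            // 154
var CLOCKS_PER_FRAME = CLOCKS_PER_LINE * LINES_PER_FRAME;  // 70224

// The T-clock runs at 4194304 Hz, giving just under 60 frames per second
var CPU_HZ = 4194304;
var FRAMES_PER_SECOND = CPU_HZ / CLOCKS_PER_FRAME;
```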
<canvas id="screen" width="160" height="144"></canvas> <a id="reset">Reset</a> | <a id="run">Run</a>
jsGB = {
    reset: function() {
        GPU.reset();
        MMU.reset();
        Z80.reset();
        MMU.load('test.gb');
    },

    frame: function() {
        var fclk = Z80._clock.t + 70224;
        do {
            Z80._map[MMU.rb(Z80._r.pc++)]();
            Z80._r.pc &= 65535;
            Z80._clock.m += Z80._r.m;
            Z80._clock.t += Z80._r.t;
            GPU.step();
        } while(Z80._clock.t < fclk);
    },

    _interval: null,

    run: function() {
        if(!jsGB._interval) {
            // setInterval keeps running frames until paused, and pairs
            // with the clearInterval call below
            jsGB._interval = setInterval(jsGB.frame, 1);
            document.getElementById('run').innerHTML = 'Pause';
        } else {
            clearInterval(jsGB._interval);
            jsGB._interval = null;
            document.getElementById('run').innerHTML = 'Run';
        }
    }
};

window.onload = function() {
    document.getElementById('reset').onclick = jsGB.reset;
    document.getElementById('run').onclick = jsGB.run;
    jsGB.reset();
};
Previously shown in Figure 1 is the result of bringing this code together: the emulator is capable of loading and running a graphics-based demo. In this case, the test ROM being loaded is a scrolling test written by Doug Lanford: the background displayed will scroll when one of the directional keypad buttons is pressed. In this particular case, with the keypad un-emulated, a static background is displayed.
In the next part, this piece of the jigsaw will be put in place: a keypad simulation which can provide the appropriate inputs to the emulated program. I'll also be looking at how the keypad works, and how the inputs are mapped into memory.
Imran Nazar <tf@imrannazar.com>, Sep 2010.
Previously in this series, the shape of a GameBoy emulator was brought together, and the timings established between the CPU and graphics processor. A canvas has been initialised and is ready for graphics to be drawn by the emulated GameBoy; the GPU emulation now has structure, but is still unable to render graphics to the framebuffer. In order for the emulation to render graphics, the concepts behind GameBoy graphics must be briefly examined.
Just like most consoles of the era, the GameBoy didn't have enough memory to allow for a direct framebuffer to be held in memory. Instead, a tile system is employed: a set of small bitmaps is held in memory, and a map is built using references to these bitmaps. The innate advantage to this system is that one tile can be used repeatedly through the map, simply by using its reference.
The GameBoy's tiled graphics system operates with tiles of 8x8 pixels, and 256 unique tiles can be used in a map; there are two maps of 32x32 tiles that can be held in memory, and one of them can be used for the display at a time. There is space in the GameBoy memory for 384 tiles, so half of them are shared between the maps: one map uses tile numbers from 0 to 255, and the other uses numbers between -128 and 127 for its tiles.
In video memory, the layout of the tile data and maps runs as follows.
Region | Usage |
---|---|
8000-87FF | Tile set #1: tiles 0-127 |
8800-8FFF | Tile set #1: tiles 128-255; tile set #0: tiles -1 to -128 |
9000-97FF | Tile set #0: tiles 0-127 |
9800-9BFF | Tile map #0 |
9C00-9FFF | Tile map #1 |
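The shared middle region means a byte read from the tile map has to be interpreted according to the tile set in use. The sketch below converts a map byte into an index from 0 to 383, counting 16-byte tiles from 0x8000; tileNumber is a hypothetical helper name.

```javascript
// Hypothetical helper: convert a byte read from the tile map into an
// index (0-383) into the tile data area that starts at 0x8000.
// tileSet is 1 for the unsigned set, 0 for the signed set.
function tileNumber(raw, tileSet) {
    if(tileSet === 1) return raw;               // tiles 0-255 at 8000-8FFF
    var signed = (raw < 128) ? raw : raw - 256; // reinterpret as -128..127
    return 256 + signed;                        // tile 0 of set #0 is at 9000
}
```

With tile set #0, map bytes 0 to 127 select tiles 256 to 383, while bytes 128 to 255 (read as -128 to -1) select tiles 128 to 255, the region shared with tile set #1.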
When a background is defined, its map and tile data interact to produce the graphical display:
The background map is, as previously mentioned, 32x32 tiles; this comes to 256 by 256 pixels. The display of the GameBoy is 160x144 pixels, so there's scope for the background to be moved relative to the screen. The GPU achieves this by defining a point in the background that corresponds to the top-left of the screen: by moving this point between frames, the background is made to scroll on the screen. For this reason, the definition of the top-left corner is held by two GPU registers: Scroll X and Scroll Y.
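As a worked example of the scroll registers, the sketch below finds which tile and which pixel within that tile is shown at a given screen position. backgroundPixel is a hypothetical helper; it assumes the background wraps around at 256 pixels, which is how the rendering code later in this part treats it.

```javascript
// Hypothetical helper: locate the background pixel shown at a screen position.
// The background is 256x256 pixels, so coordinates wrap at 256.
function backgroundPixel(screenX, screenY, scx, scy) {
    var bx = (screenX + scx) & 255;
    var by = (screenY + scy) & 255;
    return {
        tileCol: bx >> 3, // which of the 32 tiles across
        tileRow: by >> 3, // which of the 32 tiles down
        px: bx & 7,       // pixel column within the tile
        py: by & 7        // pixel row within the tile
    };
}

// Top-left of the screen with Scroll X = 250 and Scroll Y = 10
var corner = backgroundPixel(0, 0, 250, 10);
```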
The GameBoy is often described as a monochrome machine, capable of displaying only black and white. This isn't quite true: the GameBoy can also handle light and dark grey, for a total of four colours. Representing one of these four colours in the tile data takes two bits, so each tile in the tile data set is held in (8x8x2) bits, or 16 bytes.
One additional complication for the GameBoy background is that a palette sits between the tile data and the final display: each of the four possible values for a tile pixel can correspond to any of the four colours. This is used mainly to allow easy colour changes for the tile set; if, for example, a set of tiles is held corresponding to the English alphabet, an inverse-video version can be built by changing the palette, instead of taking up another part of the tile set. The four palette entries are all updated at once, by changing the value of the Background Palette GPU register; the colour references used, and the structure of the register, are shown below.
Value | Pixel | Emulated colour |
---|---|---|
0 | Off | [255, 255, 255] |
1 | 33% on | [192, 192, 192] |
2 | 66% on | [96, 96, 96] |
3 | On | [0, 0, 0] |
As stated above, each pixel in the tile data set is represented by two bits: these bits are read by the GPU when the tile is referenced in the map, run through the palette and pushed to screen. The hardware of the GPU is wired such that one whole row of the tile is accessible at the same time, and the pixels are cycled through by running up the bits. The only issue with this is that one row of the tile is two bytes: from this results the slightly convoluted scheme for storage of the bits, where each pixel's low bit is held in one byte, and the high bit in the other byte.
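The two-byte row format is easier to follow with a small worked example. This sketch decodes a single row; decodeTileRow is a hypothetical name, and the byte values are arbitrary.

```javascript
// Decode one 8-pixel tile row from its two bytes.
// lowByte holds bit 0 of each pixel, highByte holds bit 1;
// bit 7 of each byte belongs to the leftmost pixel.
function decodeTileRow(lowByte, highByte) {
    var row = [];
    for(var x = 0; x < 8; x++) {
        var mask = 1 << (7 - x);
        row.push(((lowByte & mask) ? 1 : 0) + ((highByte & mask) ? 2 : 0));
    }
    return row;
}

// 0x3C = 00111100 (low bits), 0x7E = 01111110 (high bits)
var row = decodeTileRow(0x3C, 0x7E); // [0, 2, 3, 3, 3, 3, 2, 0]
```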
Since JavaScript isn't ideally suited for manipulating bitmap structures quickly, the most time-efficient way of handling the tile data set is to maintain an internal data set alongside the video memory, with a more expanded view where each pixel's value has been pre-calculated. In order for this to accurately reflect the tile data set, any writes to the video RAM must trigger the function to update the GPU's internal tile data.
_tileset: [],

reset: function() {
    // In addition to previous reset code:
    GPU._tileset = [];
    for(var i = 0; i < 384; i++) {
        GPU._tileset[i] = [];
        for(var j = 0; j < 8; j++) {
            GPU._tileset[i][j] = [0,0,0,0,0,0,0,0];
        }
    }
},

// Takes a value written to VRAM, and updates the
// internal tile data set
updatetile: function(addr, val) {
    // Get the "base address" for this tile row
    addr &= 0x1FFE;

    // Work out which tile and row was updated
    var tile = (addr >> 4) & 511;
    var y = (addr >> 1) & 7;

    var sx;
    for(var x = 0; x < 8; x++) {
        // Find bit index for this pixel
        sx = 1 << (7-x);

        // Update tile set
        GPU._tileset[tile][y][x] =
            ((GPU._vram[addr] & sx)   ? 1 : 0) +
            ((GPU._vram[addr+1] & sx) ? 2 : 0);
    }
}
wb: function(addr, val) {
    switch(addr & 0xF000) {
        // Only the VRAM case is shown:
        case 0x8000: case 0x9000:
            GPU._vram[addr & 0x1FFF] = val;
            GPU.updatetile(addr, val);
            break;
    }
}
With these pieces in place, it's possible to begin rendering the GameBoy screen. Since this is being done on a line-by-line basis, the renderscan function referred to in Part 3 must, before it renders a scanline, work out where it is on the screen. This involves calculating the X and Y coordinates of the position in the background map, using the scroll registers and the current scanline counter. Once this has been determined, the scan renderer can advance through each tile in that row of the map, pulling in new tile data as it encounters each tile.
renderscan: function() {
    // VRAM offset for the tile map
    var mapoffs = GPU._bgmap ? 0x1C00 : 0x1800;

    // Which line of tiles to use in the map
    // (each map row is 32 tiles, so the row number is multiplied by 32)
    mapoffs += (((GPU._line + GPU._scy) & 255) >> 3) << 5;

    // Which tile to start with in the map line
    var lineoffs = (GPU._scx >> 3);

    // Which line of pixels to use in the tiles
    var y = (GPU._line + GPU._scy) & 7;

    // Where in the tileline to start
    var x = GPU._scx & 7;

    // Where to render on the canvas
    var canvasoffs = GPU._line * 160 * 4;

    // Read tile index from the background map
    var colour;
    var tile = GPU._vram[mapoffs + lineoffs];

    // If the tile data set in use is #0, the
    // indices are signed; calculate a real tile offset
    if(GPU._bgtile == 0 && tile < 128) tile += 256;

    for(var i = 0; i < 160; i++) {
        // Re-map the tile pixel through the palette
        colour = GPU._pal[GPU._tileset[tile][y][x]];

        // Plot the pixel to canvas
        GPU._scrn.data[canvasoffs+0] = colour[0];
        GPU._scrn.data[canvasoffs+1] = colour[1];
        GPU._scrn.data[canvasoffs+2] = colour[2];
        GPU._scrn.data[canvasoffs+3] = colour[3];
        canvasoffs += 4;

        // When this tile ends, read another
        x++;
        if(x == 8) {
            x = 0;
            lineoffs = (lineoffs + 1) & 31;
            tile = GPU._vram[mapoffs + lineoffs];
            if(GPU._bgtile == 0 && tile < 128) tile += 256;
        }
    }
}
With a CPU, memory handling and a graphics subsystem, the emulator is nearly capable of producing output. In part 5, I'll be looking at what's required to get the system from a disparate set of module files to a coherent whole, capable of loading and running a simple ROM file: tying the graphics registers to the MMU, and a simple interface to control the running of the emulation.
Imran Nazar <tf@imrannazar.com>, Aug 2010.
In the previous parts of this series, a structure for a GameBoy emulator was laid out, and brought to the point where a game ROM could be loaded, and stepped through by the emulated CPU. With the emulated processor attached to a memory mapping structure, it's now possible to attach peripherals to the system. One of the primary peripherals used by the GameBoy, and by any games console, is the graphics processor (GPU): it's the primary method of output for the console, and much of the processor's work goes on generating graphics for the GPU.
Nintendo's internal name for the GameBoy is "Dot Matrix Game"; its display is a pixel LCD of dimensions 160x144. If each pixel in the LCD is treated as a pixel in a HTML5 <canvas>, a direct mapping can be made to a canvas of width 160 and height 144. In order to directly address each pixel in the LCD, the contents of the canvas can be manipulated as a "framebuffer": a single block of memory containing the entirety of the canvas, as a series of 4-byte RGBA values.
<canvas id="screen" width="160" height="144"></canvas>
GPU = {
    _canvas: {},
    _scrn: {},

    reset: function() {
        var c = document.getElementById('screen');
        if(c && c.getContext) {
            GPU._canvas = c.getContext('2d');
            if(GPU._canvas) {
                if(GPU._canvas.createImageData)
                    GPU._scrn = GPU._canvas.createImageData(160, 144);
                else if(GPU._canvas.getImageData)
                    GPU._scrn = GPU._canvas.getImageData(0,0, 160,144);
                else
                    GPU._scrn = {
                        'width': 160,
                        'height': 144,
                        'data': new Array(160*144*4)
                    };

                // Initialise canvas to white
                for(var i=0; i<160*144*4; i++)
                    GPU._scrn.data[i] = 255;

                GPU._canvas.putImageData(GPU._scrn, 0, 0);
            }
        }
    }
}
Once a block of memory has been allocated for the screen data, an individual pixel's colour can be set by writing RGBA components to the four values at that pixel position in the block; the pixel position can be determined by the formula y * 160 + x, and since each pixel occupies four values, its first component sits at index (y * 160 + x) * 4 in the block.
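As an illustration of addressing the framebuffer, the sketch below writes one pixel into a plain array laid out like the canvas image data. Since each pixel occupies four consecutive values, the pixel position is multiplied by four to find its place in the block. setPixel is a hypothetical helper, and the data array stands in for GPU._scrn.data.

```javascript
// Hypothetical helper: write one RGBA pixel into a 160-pixel-wide framebuffer
function setPixel(data, x, y, rgba) {
    var offs = (y * 160 + x) * 4; // four values per pixel
    data[offs + 0] = rgba[0];     // red
    data[offs + 1] = rgba[1];     // green
    data[offs + 2] = rgba[2];     // blue
    data[offs + 3] = rgba[3];     // alpha
}

var data = new Array(160 * 144 * 4);
setPixel(data, 2, 1, [96, 96, 96, 255]); // dark grey at column 2, row 1
```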
With a canvas in place to receive the graphic output of the GameBoy, the next step is to emulate the production of graphics. The original GameBoy hardware simulates a cathode-ray tube (CRT) in its timings: in a CRT, the screen is scanned in rows by an electron beam, and the scanning process returns to the top of the screen after the end of scanning.
As can be seen above, a CRT requires more time to draw a scanline than simply running over the pixels in question: a "horizontal blanking" period is needed, for the beam to move from the end of one line to the start of the next. Similarly, the end of each frame means a "vertical blanking" period, while the beam travels back to the top-left corner. Since the beam has to move further in vertical blanking, this time period is commonly much longer than the horizontal blanking time.
In the same way, a GameBoy display exhibits horizontal and vertical blanking periods. In addition, time spent within the scanline itself is separated into two parts: the GPU flips between accessing video memory, and accessing sprite attribute memory, while it draws the scanline. For the purpose of this emulation, these two parts are distinct, and follow each other. The following table states how long the GPU stays in each period, in terms of the CPU's T-clock which runs at 4194304 Hz.
Period | GPU mode number | Time spent (clocks) |
---|---|---|
Scanline (accessing OAM) | 2 | 80 |
Scanline (accessing VRAM) | 3 | 172 |
Horizontal blank | 0 | 204 |
One line (scan and blank) | | 456 |
Vertical blank | 1 | 4560 (10 lines) |
Full frame (scans and vblank) | | 70224 |
In order to maintain these timings relative to the emulated CPU, a timing update function must exist, which gets called after the execution of every instruction. This can be done from an expanded version of the CPU dispatch process, covered in part 1.
while(true) { Z80._map[MMU.rb(Z80._r.pc++)](); Z80._r.pc &= 65535; Z80._clock.m += Z80._r.m; Z80._clock.t += Z80._r.t; GPU.step(); }
_mode: 0,
_modeclock: 0,
_line: 0,

step: function() {
    GPU._modeclock += Z80._r.t;

    switch(GPU._mode) {
        // OAM read mode, scanline active
        case 2:
            if(GPU._modeclock >= 80) {
                // Enter scanline mode 3
                GPU._modeclock = 0;
                GPU._mode = 3;
            }
            break;

        // VRAM read mode, scanline active
        // Treat end of mode 3 as end of scanline
        case 3:
            if(GPU._modeclock >= 172) {
                // Enter hblank
                GPU._modeclock = 0;
                GPU._mode = 0;

                // Write a scanline to the framebuffer
                GPU.renderscan();
            }
            break;

        // Hblank
        // After the last hblank, push the screen data to canvas
        case 0:
            if(GPU._modeclock >= 204) {
                GPU._modeclock = 0;
                GPU._line++;

                // Lines 0 to 143 are visible; line 144 starts vblank
                if(GPU._line == 144) {
                    // Enter vblank
                    GPU._mode = 1;
                    GPU._canvas.putImageData(GPU._scrn, 0, 0);
                } else {
                    GPU._mode = 2;
                }
            }
            break;

        // Vblank (10 lines)
        case 1:
            if(GPU._modeclock >= 456) {
                GPU._modeclock = 0;
                GPU._line++;

                if(GPU._line > 153) {
                    // Restart scanning modes
                    GPU._mode = 2;
                    GPU._line = 0;
                }
            }
            break;
    }
}
In the above code, the timings for the GPU are established, but the work of the GPU isn't yet in place: renderscan is where the work happens. In the next part of this series, the concepts behind the GameBoy's background graphics system will be looked at, and code will be put inside the rendering function to emulate them.
Imran Nazar <tf@imrannazar.com>, Aug 2010.
In the previous part of this series, the computer was introduced as a processing unit, which fetches its instructions from memory. In almost every case, a computer's memory is not a simple contiguous region; the GameBoy is no exception in this regard. Since the GameBoy CPU can access 65,536 individual locations on its address bus, a "memory map" can be drawn of all the regions where the CPU has access.
A more detailed look at the memory regions is as follows:
0000h, which is the start of the 256-byte GameBoy BIOS code. Once the BIOS has run, it is removed from the memory map, and this area of the cartridge ROM becomes addressable.
In order for the emulated CPU to access these regions separately, each must be handled as a special case in the memory management unit. This part of the code was alluded to in the previous part, and a basic interface described for the MMU object; the fleshing out of the interface can be as simple as a switch statement.
MMU = {
    // Flag indicating BIOS is mapped in
    // BIOS is unmapped with the first instruction above 0x00FF
    _inbios: 1,

    // Memory regions (initialised at reset time)
    _bios: [],
    _rom: [],
    _wram: [],
    _eram: [],
    _zram: [],

    // Read a byte from memory
    rb: function(addr) {
        switch(addr & 0xF000) {
            // BIOS (256b)/ROM0
            case 0x0000:
                if(MMU._inbios) {
                    if(addr < 0x0100)
                        return MMU._bios[addr];
                    else if(Z80._r.pc == 0x0100)
                        MMU._inbios = 0;
                }
                return MMU._rom[addr];

            // ROM0
            case 0x1000: case 0x2000: case 0x3000:
                return MMU._rom[addr];

            // ROM1 (unbanked) (16k)
            case 0x4000: case 0x5000: case 0x6000: case 0x7000:
                return MMU._rom[addr];

            // Graphics: VRAM (8k)
            case 0x8000: case 0x9000:
                return GPU._vram[addr & 0x1FFF];

            // External RAM (8k)
            case 0xA000: case 0xB000:
                return MMU._eram[addr & 0x1FFF];

            // Working RAM (8k)
            case 0xC000: case 0xD000:
                return MMU._wram[addr & 0x1FFF];

            // Working RAM shadow
            case 0xE000:
                return MMU._wram[addr & 0x1FFF];

            // Working RAM shadow, I/O, Zero-page RAM
            case 0xF000:
                switch(addr & 0x0F00) {
                    // Working RAM shadow
                    case 0x000: case 0x100: case 0x200: case 0x300:
                    case 0x400: case 0x500: case 0x600: case 0x700:
                    case 0x800: case 0x900: case 0xA00: case 0xB00:
                    case 0xC00: case 0xD00:
                        return MMU._wram[addr & 0x1FFF];

                    // Graphics: object attribute memory
                    // OAM is 160 bytes, remaining bytes read as 0
                    case 0xE00:
                        if(addr < 0xFEA0)
                            return GPU._oam[addr & 0xFF];
                        else
                            return 0;

                    // Zero-page
                    case 0xF00:
                        if(addr >= 0xFF80) {
                            return MMU._zram[addr & 0x7F];
                        } else {
                            // I/O control handling
                            // Currently unhandled
                            return 0;
                        }
                }
        }
    },

    // Read a 16-bit word
    rw: function(addr) {
        return MMU.rb(addr) + (MMU.rb(addr+1) << 8);
    }
};
In the above section of code, it should be noted that the region of memory between 0xFF00 and 0xFF7F is unhandled; these locations are used as memory-mapped I/O for the various chips that provide I/O, and will be defined as these systems are covered in later parts.
Writing a byte is handled in a very similar manner; each operation is reversed, and values are written to the various regions of memory instead of returned from the function. For this reason, it is not necessary to provide a full extrapolation of the wb function here.
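For reference, a write routine that mirrors the read switch might look something like the following. This is only a sketch under stated assumptions: writeByte and the mem container are hypothetical stand-ins for the MMU and GPU arrays, writes to the ROM region are simply ignored, and the I/O region is left unhandled just as it is for reads.

```javascript
// Sketch of a byte write, mirroring the regions of the read switch.
// "mem" is a hypothetical container: { vram, eram, wram, zram, oam }.
function writeByte(mem, addr, val) {
    switch(addr & 0xF000) {
        // ROM (32k): writes are ignored in this sketch
        case 0x0000: case 0x1000: case 0x2000: case 0x3000:
        case 0x4000: case 0x5000: case 0x6000: case 0x7000:
            break;

        // Graphics: VRAM (8k)
        case 0x8000: case 0x9000:
            mem.vram[addr & 0x1FFF] = val;
            break;

        // External RAM (8k)
        case 0xA000: case 0xB000:
            mem.eram[addr & 0x1FFF] = val;
            break;

        // Working RAM (8k) and its shadow
        case 0xC000: case 0xD000: case 0xE000:
            mem.wram[addr & 0x1FFF] = val;
            break;

        // Working RAM shadow, OAM, I/O, zero-page RAM
        case 0xF000:
            switch(addr & 0x0F00) {
                // OAM is 160 bytes; writes beyond it are dropped
                case 0xE00:
                    if(addr < 0xFEA0) mem.oam[addr & 0xFF] = val;
                    break;

                // Zero-page RAM; I/O below 0xFF80 is unhandled
                case 0xF00:
                    if(addr >= 0xFF80) mem.zram[addr & 0x7F] = val;
                    break;

                // Everything else in this block shadows working RAM
                default:
                    mem.wram[addr & 0x1FFF] = val;
                    break;
            }
            break;
    }
}
```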
Just as a CPU emulation is useless without its supporting elements of memory access, graphics and so on, being able to read a program from memory is useless without a program loaded. There are two main ways to pull a program into an emulator: hard-code it into the emulator's source code, or allow for loading of a ROM file from a certain location. The obvious disadvantage of hard-coding the program is that it's fixed, and cannot easily be changed.
In the case of this JavaScript emulator, the GameBoy BIOS is hard-coded into the MMU, because it isn't liable to change; the program file is, however, loaded from the server asynchronously, after the emulator has initialised. This can be done through XMLHTTP, using a binary file reader such as Andy Na's BinFileReader; the result of this is a string containing the ROM file.
MMU.load = function(file) { var b = new BinFileReader(file); MMU._rom = b.readString(b.getFileSize(), 0); };
Since the ROM file is held as a string, instead of an array of numbers, the rb and wb functions must be changed to index a string:
case 0x1000: case 0x2000: case 0x3000: return MMU._rom.charCodeAt(addr);
With a CPU and MMU in place, it is possible to watch a program being executed, step by step: an emulation can be achieved, and produce the expected values in the right registers. What's missing is a sense of what that means for graphical output. In the next part of this series, the issue of graphics will be looked at, including how the GameBoy structures its graphic output, and how to render graphics onto the screen.
As with part 1, the source for this article is available at: http://imrannazar.com/content/files/jsgb.mmu.js.
Imran Nazar <tf@imrannazar.com>, Aug 2010.
It's often stated that JavaScript is a special-purpose language, designed for use by web sites to enable dynamic interaction. However, JavaScript is a full object-oriented programming language, and is used in arenas besides the Web: the Widgets available for recent versions of Windows and Apple's Mac OS are implemented in JavaScript, as is the GUI for the Mozilla application suite.
With the recent introduction of the <canvas> tag to HTML, the question arises as to whether a JavaScript program is capable of emulating a system, much like desktop applications are available to emulate the Commodore 64, GameBoy Advance and other gaming consoles. The simplest way of checking whether this is viable is, of course, to write such an emulator in JavaScript.
This article sets out to implement the basis for a GameBoy emulation, by laying the groundwork for emulating each part of the physical machine. The starting point is the CPU.
The traditional model of a computer is a processing unit, which gets told what to do by a program of instructions; the program might be accessed with its own special memory, or it might be sitting in the same area as normal memory, depending on the computer. Each instruction takes a short amount of time to run, and they're all run one by one. From the CPU's perspective, a loop starts up as soon as the computer is turned on, to fetch an instruction from memory, work out what it says, and execute it.
In order to keep track of where the CPU is within the program, a number is held by the CPU called the Program Counter (PC). After an instruction is fetched from memory, the PC is advanced by however many bytes make up the instruction.
The CPU in the original GameBoy is a modified Zilog Z80, so the following things are pertinent:
In addition to the PC, other numbers are held inside the CPU that can be used for calculation, and they're referred to as registers: A, B, C, D, E, H, and L. Each of them is one byte, so each one can hold a value from 0 to 255. Most of the instructions in the Z80 are used to handle values in these registers: loading a value from memory into a register, adding or subtracting values, and so forth.
If there are 256 possible values in the first byte of an instruction, that makes for 256 possible instructions in the basic table. That table is detailed in the Gameboy Z80 opcode map released on this site. Each of these can be simulated by a JavaScript function, that operates on an internal model of the registers, and produces effects on an internal model of the memory interface.
There are other registers in the Z80, that deal with holding status: the flags register (F), whose operation is discussed below; and the stack pointer (SP) which is used alongside the PUSH and POP instructions for basic LIFO handling of values. The basic model of the Z80 emulation would therefore require the following components:
The internal state can be held as follows:
Z80 = {
    // Time clock: The Z80 holds two types of clock (m and t)
    _clock: {m:0, t:0},

    // Register set
    _r: {
        a:0, b:0, c:0, d:0, e:0, h:0, l:0, f:0,    // 8-bit registers
        pc:0, sp:0,                                // 16-bit registers
        m:0, t:0                                   // Clock for last instr
    }
};
The flags register (F) is important to the functioning of the processor: it automatically calculates certain bits, or flags, based on the result of the last operation. There are four flags in the Gameboy Z80:
Since the basic calculation registers are 8-bits, the carry flag allows for the software to work out what happened to a value if the result of a calculation overflowed the register. With these flag handling issues in mind, a few examples of instruction simulations are shown below. These examples are simplified, and don't calculate the half-carry flag.
Z80 = {
    // Internal state
    _clock: {m:0, t:0},
    _r: {a:0, b:0, c:0, d:0, e:0, h:0, l:0, f:0, pc:0, sp:0, m:0, t:0},

    // Add E to A, leaving result in A (ADD A, E)
    ADDr_e: function() {
        Z80._r.a += Z80._r.e;                      // Perform addition
        Z80._r.f = 0;                              // Clear flags
        if(!(Z80._r.a & 255)) Z80._r.f |= 0x80;    // Check for zero
        if(Z80._r.a > 255) Z80._r.f |= 0x10;       // Check for carry
        Z80._r.a &= 255;                           // Mask to 8-bits
        Z80._r.m = 1; Z80._r.t = 4;                // 1 M-time taken
    },

    // Compare B to A, setting flags (CP A, B)
    CPr_b: function() {
        var i = Z80._r.a;                          // Temp copy of A
        i -= Z80._r.b;                             // Subtract B
        Z80._r.f = 0x40;                           // Set subtraction flag, clearing the rest
        if(!(i & 255)) Z80._r.f |= 0x80;           // Check for zero
        if(i < 0) Z80._r.f |= 0x10;                // Check for underflow
        Z80._r.m = 1; Z80._r.t = 4;                // 1 M-time taken
    },

    // No-operation (NOP)
    NOP: function() {
        Z80._r.m = 1; Z80._r.t = 4;                // 1 M-time taken
    }
};
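For completeness, the half-carry flag can be calculated alongside the others by checking for a carry out of the low four bits. The sketch below is a standalone function rather than part of the Z80 object; addWithFlags is a hypothetical name, and it assumes the half-carry flag sits at bit 5 (0x20) of F, between the operation and carry flags used in the examples above.

```javascript
// Sketch: 8-bit addition with all four flags calculated.
// Flag bits: zero 0x80, operation 0x40, half-carry 0x20 (assumed), carry 0x10.
function addWithFlags(a, e) {
    var sum = a + e;
    var f = 0;                                      // Addition clears the operation flag
    if(!(sum & 255)) f |= 0x80;                     // Result is zero after masking
    if(((a & 0x0F) + (e & 0x0F)) > 0x0F) f |= 0x20; // Carry out of bit 3
    if(sum > 255) f |= 0x10;                        // Carry out of bit 7
    return { a: sum & 255, f: f };
}

var result = addWithFlags(0x0F, 0x01); // a = 0x10, f = 0x20
```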
A processor that can manipulate registers within itself is all well and good, but it must be able to put results into memory to be useful. In the same way, the above CPU emulation requires an interface to emulated memory; this can be provided by a memory management unit (MMU). Since the Gameboy itself doesn't contain a complicated MMU, the emulated unit can be quite simple.
At this point, the CPU only needs to know that an interface is present; the details of how the Gameboy maps banks of memory and hardware onto the address bus are inconsequential to the processor's operation. Four operations are required by the CPU:
MMU = { rb: function(addr) {/* Read 8-bit byte from a given address */}, rw: function(addr) {/* Read 16-bit word from a given address */}, wb: function(addr, val) {/* Write 8-bit byte to a given address */}, ww: function(addr, val) {/* Write 16-bit word to a given address */} };
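The 16-bit operations can be built from the byte operations, with the low byte held at the lower address. The sketch below demonstrates this against a flat array; MiniMMU and mem are hypothetical stand-ins, not the emulator's MMU.

```javascript
// Sketch: 16-bit accessors built on byte accessors, low byte first.
// "mem" is a hypothetical flat array standing in for the real memory map.
var mem = [];
var MiniMMU = {
    rb: function(addr) { return mem[addr & 0xFFFF] || 0; },
    wb: function(addr, val) { mem[addr & 0xFFFF] = val & 255; },

    rw: function(addr) {
        return MiniMMU.rb(addr) + (MiniMMU.rb(addr + 1) << 8);
    },
    ww: function(addr, val) {
        MiniMMU.wb(addr, val & 255);            // Low byte
        MiniMMU.wb(addr + 1, (val >> 8) & 255); // High byte
    }
};

MiniMMU.ww(0xC000, 0x1234); // stores 0x34 at 0xC000 and 0x12 at 0xC001
```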
With these in place, the rest of the CPU instructions can be simulated. Another few examples are shown below:
// Push registers B and C to the stack (PUSH BC)
PUSHBC: function() {
    Z80._r.sp--;                               // Drop through the stack
    MMU.wb(Z80._r.sp, Z80._r.b);               // Write B
    Z80._r.sp--;                               // Drop through the stack
    MMU.wb(Z80._r.sp, Z80._r.c);               // Write C
    Z80._r.m = 3; Z80._r.t = 12;               // 3 M-times taken
},

// Pop registers H and L off the stack (POP HL)
POPHL: function() {
    Z80._r.l = MMU.rb(Z80._r.sp);              // Read L
    Z80._r.sp++;                               // Move back up the stack
    Z80._r.h = MMU.rb(Z80._r.sp);              // Read H
    Z80._r.sp++;                               // Move back up the stack
    Z80._r.m = 3; Z80._r.t = 12;               // 3 M-times taken
},

// Read a byte from absolute location into A (LD A, addr)
LDAmm: function() {
    var addr = MMU.rw(Z80._r.pc);              // Get address from instr
    Z80._r.pc += 2;                            // Advance PC
    Z80._r.a = MMU.rb(addr);                   // Read from address
    Z80._r.m = 4; Z80._r.t = 16;               // 4 M-times taken
}
With the instructions in place, the remaining pieces of the puzzle for the CPU are to reset the CPU when it starts up, and to feed instructions to the emulation routines. Having a reset routine allows for the CPU to be stopped and "rewound" to the start of execution; an example is shown below.
    reset: function() {
        Z80._r.a = 0; Z80._r.b = 0; Z80._r.c = 0; Z80._r.d = 0;
        Z80._r.e = 0; Z80._r.h = 0; Z80._r.l = 0; Z80._r.f = 0;
        Z80._r.sp = 0;
        Z80._r.pc = 0;      // Start execution at 0

        Z80._clock.m = 0; Z80._clock.t = 0;
    }
In order for the emulation to run, it has to emulate the fetch-decode-execute sequence detailed earlier. "Execute" is taken care of by the instruction emulation functions, but fetch and decode require a specialist piece of code, known as a "dispatch loop". This loop takes each instruction, decodes where it must be sent for execution, and dispatches it to the function in question.
    while(true) {
        var op = MMU.rb(Z80._r.pc++);              // Fetch instruction
        Z80._map[op]();                            // Dispatch
        Z80._r.pc &= 65535;                        // Mask PC to 16 bits
        Z80._clock.m += Z80._r.m;                  // Add time to CPU clock
        Z80._clock.t += Z80._r.t;
    }

    Z80._map = [
        Z80._ops.NOP,
        Z80._ops.LDBCnn,
        Z80._ops.LDBCmA,
        Z80._ops.INCBC,
        Z80._ops.INCr_b,
        ...
    ];
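To see the fetch-decode-execute cycle run end to end, the following is a self-contained sketch of a dispatch loop over a made-up three-opcode machine. The opcode values, the `INCa` and `HALT` handlers and the `cpu` object are all hypothetical, invented for this illustration; they are not part of the Z80 core described here.

```javascript
// Hypothetical three-opcode machine, only to illustrate fetch-decode-execute.
const mem = new Uint8Array([0x01, 0x01, 0x00, 0xFF]); // INC A, INC A, NOP, HALT
const cpu = { pc: 0, a: 0, m: 0, clock: 0, halted: false };

const ops = {
    NOP:  function() { cpu.m = 1; },
    INCa: function() { cpu.a = (cpu.a + 1) & 255; cpu.m = 1; },
    HALT: function() { cpu.halted = true; cpu.m = 1; }
};

// Opcode value -> handler; unmapped opcodes fall back to NOP in this sketch
const map = [];
map[0x00] = ops.NOP;
map[0x01] = ops.INCa;
map[0xFF] = ops.HALT;

while (!cpu.halted) {
    const op = mem[cpu.pc++];        // Fetch instruction
    (map[op] || ops.NOP)();          // Decode and dispatch
    cpu.pc &= 65535;                 // Mask PC to 16 bits
    cpu.clock += cpu.m;              // Add time to CPU clock
}

console.log(cpu.a, cpu.clock);       // prints "2 4"
```

Unlike the unbounded `while(true)` above, this version stops on a halt flag so that it can be run as a standalone script.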
Implementing a Z80 emulation core is useless without an emulator to run it. In the next part of this series, the work of emulating the Gameboy begins: I'll be looking at the Gameboy's memory map, and how a game image can be loaded into the emulator over the Web.
The complete Z80 core is available at: http://imrannazar.com/content/files/jsgb.z80.js; please feel free to let me know if you encounter any bugs in the implementation.
Imran Nazar <tf@imrannazar.com>, Jul 2010.
In PHP, one of the more common ways to implement this is to generate multiple "phrase files", each of which contains all the phrases required for a given language. Each phrase is defined as a constant; an example of such a phrase file could run as follows.
    define('_WELCOME', 'Hoşgeldiniz!');
    define('_FREE_DELIVERY', 'Ücretsiz dağıtma');
    define('_BRANDS', 'Markalar');
    define('_COMPUTERS', 'Bilgisayarlar');
The object-oriented approach yields another method of holding these translation phrases, as class constants within a phrases class. This alternative approach would be performed as below.
    class Phrases
    {
        const WELCOME = 'Hoşgeldiniz!';
        const FREE_DELIVERY = 'Ücretsiz dağıtma';
        const BRANDS = 'Markalar';
        const COMPUTERS = 'Bilgisayarlar';
    }
Since there are probably a large number of these phrases, it makes sense to seek the most memory-efficient of these two constructs: in other words, the one which takes the least amount of memory to be held. It might be expected that both methods of defining the phrase file would cause the same amount of memory to be taken up; this is not the case.
(A note: the analysis presented below is based on the definitions held in PHP 5.2.10; the values produced may differ in future versions.)
The first of the two methods above defines a series of constants in the global scope; internally to PHP, these were initially stored as an array of `struct zend_constant`. The layout of this data structure, and the memory used, are derived from the basic `zval` structure used by PHP to hold values.
This data structure is memory-efficient, being able to store the name and value of the constant, as well as the other data required by PHP for a value, such as the reference count. The problem with defining `zend_constant`s as an array is that the time taken to find a particular constant grows linearly with the number of constants that are defined; since the PHP interpreter itself sets constants such as `PHP_VERSION`, there will always be a disadvantage for user code in terms of speed.
To resolve this, a `HashTable` data structure was introduced with PHP 3, to allow for near-constant-time lookup of the constants by name. The `HashTable` uses `Bucket`s to store its data entries, each of which has a name attached.
The trade-off in using a hash structure like this, is that more memory is taken for storage of the hashes and key lengths. In total, storage of a global constant takes 42 bytes before the strings for name and value are counted.
With the introduction of objects in PHP 4, a new line of thinking was employed for class-level constants: if a `HashTable` is used, and a hash of the constant name is stored, the name itself doesn't need to be stored alongside it. This allows a good chunk of space to be saved, and the structure of the storage to be simplified somewhat.
As can be seen here, the name of the constant isn't stored at all once its hash has been calculated. This means that the `zend_constant` structure can be eliminated, leaving only the `zval`. As a result, a class constant needs 26 bytes to be stored, before the value is counted.
Two conclusions can be drawn from this analysis:
Imran Nazar <tf@imrannazar.com>, June 2010.
The main issue in drawing a Venn diagram is, given the percentage of overlap, determining the placement of the circles such that it visually matches the stated overlap. Once the circle dimensions and placements have been worked out, the image manipulation is relatively straightforward.
In this article, I'll be using PHP to demonstrate the implementation, and the `imagick` interface to ImageMagick in order to draw and output the image.
Mathematically, two overlapping circles will cross each other at two points: a line between these two points is a chord of the circles, and the area contained within each chord segment by this line is 50% of the total overlap.
In order to correctly place the circles, it's important to find out what x and h are in the above diagram; knowing these values will allow for easy calculation of the horizontal positions. By using the standard formulae for area and angle of a circle segment, the following equation can be obtained.
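The equation itself is not reproduced in this copy of the article. Reconstructed from the `Sagitta::f` function in the PHP implementation further down (so the notation here is an assumption), with r as the circle radius and P as the fraction of one circle's area taken up by its chord segment, it would read:

```latex
\arccos\left(\frac{x}{r}\right) - \pi P - \frac{x}{r^{2}}\sqrt{(r+x)(r-x)} = 0
```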
By solving this equation, we can get the length of the sagitta, x. The problem presented by this equation, however, is that it cannot be solved analytically by working with the equation terms. A numerical approach will need to be used, to find a solution.
One of the most common numerical algorithms for solving an equation is the Newton-Raphson method, also known as Newton's method. It uses the gradient of the function at a particular point, to guess the next point. By picking a good starting point, it's possible to quickly narrow down a solution to the function (the point at which it crosses the x-axis).
As can be seen in the above figure, the algorithm follows the gradient line down to the x-axis, and uses the crossing point there as its next guess for the solution. Taking another gradient from the function at that point, the algorithm homes in on the solution within (in the above case) 4 or 5 iterations. When used on a formula, the gradient is represented by the differential of the formula in question; for the sagitta length formula, the differential is:
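The differential is likewise missing from this copy. Going by `Sagitta::fp` in the PHP implementation further down, and writing s for the repeated square-root term (a shorthand assumed here), it would be:

```latex
f'(x) = \frac{\dfrac{x^{2}}{r^{3}} - \dfrac{1}{r}}{s} - \frac{s}{r},
\qquad s = \sqrt{1 - \frac{x^{2}}{r^{2}}}
```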
With both formulae to hand (the function itself and the differential), the iteration process is a simple calculation:
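In the standard Newton-Raphson form, with f as the function and f' as its differential, each new guess is derived from the previous one as:

```latex
x_{n+1} = x_{n} - \frac{f(x_{n})}{f'(x_{n})}
```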
This calculation can be repeated until the answer is close to the expected solution: in other words, when successive iterations don't result in a significant change to the answer. The definition of "significant change" depends on the problem: in this case, I'll be using "the same to four decimal places".
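As a sketch of that stopping rule, the loop below repeats the Newton-Raphson step until two successive guesses agree to within a given precision. The functions `f` and `fp` mirror the sagitta formula and its differential as they appear in the article's PHP class; the iterative structure and the `maxIterations` cap are assumptions of this sketch rather than the article's own (recursive) approach.

```javascript
// f(x): the sagitta length formula, mirroring Sagitta::f in the article's PHP
function f(x, r, P) {
    return Math.acos(x / r) - Math.PI * P - (x / (r * r)) * Math.sqrt((r + x) * (r - x));
}

// f'(x): its differential, mirroring Sagitta::fp
function fp(x, r) {
    const s = Math.sqrt(1 - (x * x) / (r * r));
    return ((x * x) / (r * r * r) - 1 / r) / s - s / r;
}

// Repeat the Newton-Raphson step until successive guesses agree to within
// `precision`, giving up after `maxIterations` (an assumed safety cap)
function solve(x0, r, P, precision = 0.0001, maxIterations = 10) {
    let x = x0;
    for (let i = 0; i < maxIterations; i++) {
        const xn = x - f(x, r, P) / fp(x, r);
        if (Math.abs(xn - x) <= precision) return xn;
        x = xn;
    }
    return x;
}

// 20% overlap on circles of radius 150: each segment holds half of the overlap
console.log(solve(0, 150, 0.2 / 2).toFixed(4));
```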
As an example, suppose that the Venn diagram in Figure 1 is being generated: a diagram showing 20% overlap, where each circle has a radius of 150 pixels. Plugging these values into the Newton-Raphson solver shows the following values for each iteration.
0.0000 94.2478 102.7742 103.0570 103.0573 103.0573
As can be seen, the solver quickly converges on the answer for the length x. From here, h can be calculated as the difference between x and the radius, and the angle θ as:
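The formula for the angle is missing from this copy; based on the `$theta` calculation in the drawing code further down, which then converts the result from radians to degrees, it would be:

```latex
\theta = \arccos\left(\frac{x}{r}\right)
```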
In PHP, the solver can be implemented by defining the sagitta formula and its differential as two functions, and using a recursive function to run through their values. The following implementation contains a "safety valve" for the solver, for the general case where the equations may cause a divergence if the solver starts at x=0. In the case of this equation, the safety valve is unnecessary, since the algorithm will always converge if it starts at 0; it is included below for completeness.
    class Sagitta
    {
        // The sagitta length formula
        static function f($x, $r, $P)
        {
            return acos($x/$r)-(M_PI*$P)-(($x/($r*$r))*sqrt(($r+$x)*($r-$x)));
        }

        // Differential of the length
        static function fp($x, $r, $P)
        {
            $s = sqrt(1-(($x*$x)/($r*$r)));
            return (((($x*$x)/($r*$r*$r)) - (1/$r)) / $s) - ($s/$r);
        }

        // Recursive solver
        // Built-in safety valve at 10 levels down
        static function solve($x, $r, $P, $level=10, $precision=0.0001)
        {
            $xn = $x - (self::f($x,$r,$P) / self::fp($x,$r,$P));

            // Recurse while successive guesses still differ by more than the precision
            if($level && (abs($xn - $x) > $precision))
                return self::solve($xn, $r, $P, $level-1, $precision);
            else
                return $xn;
        }
    }

    $radius = 150;
    $overlap = 0.2;

    // Each circle contains half of the overlap; use this to calculate x
    $x0 = 0;
    $x = Sagitta::solve($x0, $radius, $overlap/2);
Using PHP and `imagick`, the Venn diagram can be drawn quickly and efficiently based on the value for x obtained above. There are, however, a few issues that must be resolved:

- To draw a circle with `imagick`, its centre coordinate must be given, and this must be calculated horizontally for both circles. For the left-hand circle, this is simply one radius in from the left of the image. The right-hand circle would be three radii from the left edge, if there were no overlap; from Figure 2, it can be seen that there is 2h of overlap, so this must be subtracted from the horizontal coordinate of the right-hand circle.
- `imagick` provides a construct for drawing an ellipse segment: given two angles, it will plot the arc and chord between them, and fill the space in the "fill colour" defined beforehand.

Having taken these issues into account, the following code will generate the Venn diagram given x.
    $r = $radius;
    $h = $r - $x;
    $theta = acos($x/$r) * (180 / M_PI);

    // 5 pixels of padding around the Venn
    $padding = 5;
    $overlap_width = 2*$h;

    $im = new Imagick();
    $im->newImage($r*4 - $overlap_width + ($padding*2), $r*2 + ($padding*2), new ImagickPixel('white'));
    $draw = new ImagickDraw();

    // Left-hand circle, in green
    $draw->setFillColor(new ImagickPixel('#88ff88'));
    $draw->ellipse($r + $padding, $r + $padding, $r, $r, 0, 360);

    // Right-hand circle, in blue
    $draw->setFillColor(new ImagickPixel('#8888ff'));
    $draw->ellipse($r*3 - $overlap_width + $padding, $r + $padding, $r, $r, 0, 360);

    // Intersection, in cyan
    // Angles are specified in degrees, from the rightmost point of the circle
    $draw->setFillColor(new ImagickPixel('#88ffff'));

    // Left-hand segment (right half of intersection)
    // -theta is in the top right, +theta in the bottom right
    $draw->ellipse($r + $padding, $r + $padding, $r, $r, -$theta, $theta);

    // Right-hand segment (left half of intersection)
    // 180-theta is in the bottom left, 180+theta in the top left
    $draw->ellipse($r*3 - $overlap_width + $padding, $r + $padding, $r, $r, 180-$theta, 180+$theta);

    // Image bounding rectangle
    $draw->setStrokeColor(new ImagickPixel('black'));
    $draw->setStrokeWidth(1);
    $draw->setFillOpacity(0);
    $draw->rectangle(0, 0, $r*4 - $overlap_width + ($padding*2) - 1, $r*2 + ($padding*2) - 1);

    // Output image
    $im->drawImage($draw);
    $im->setImageFormat('png');
    header('Content-type: image/png');
    echo $im;
The above code results in Figure 1.
One problem that remains with this implementation is the range of overlap percentages. If an overlap of less than 0% is given (if, in other words, the sets don't overlap), the equations above result in complex roots and PHP crashes while attempting to calculate them. Similarly, if the overlap is specified as more than 100%, this should reverse the positions of the sets in the Venn diagram; instead, the equations will produce a small section of one circle which is rendered as all intersection. A simple range check on the overlap percentage can alleviate these issues, and prevent them from being passed through to the script.
Another limitation is the inherent tie of two sets that is implied by this script; it is not possible to specify an overlap between three sets using this model. The geometry to allow for three intersecting circles to be specified is left as an exercise for the reader.
Imran Nazar <tf@imrannazar.com>, May 2010.
Apart from a motorcycle on I-70. The road is rutted, but she makes her way around the holes, weaving across the lanes towards Denver. She'd found out about Cheyenne, and had cut across fields to avoid the cordon; there was something under the mountain, and she was here to find out what.
She rolls up to the tunnel through the mountain, filled in by the collapse after the meltdown. The infill has been smoothed over, and a porthole window fitted at around waist height. She peers in, expecting to see only rock and stone, and sees —
Light. Spirals of light against a perfectly dark background, as if there were galaxies of stars through that porthole. A whole universe, underneath the mountain.
They say the universes in this ring start in the same place, and grow outward. They don't say where that place is.
Ryan explained that, for testing purposes, the subject would be a starfish from the Bond Street aquarium. He wheeled the laser in, and fired; the starfish promptly vanished.
"I thought you were demonstrating levitation to six feet, Dr Ryan," said one of his assistants, as they gathered over a star-shaped hole in the pavement.
"The laser must be upside down; give me a minute," Ryan answered.
Discordianism uses a calendar inspired by the well-established Gregorian calendar, with prominence given to the number five. As opposed to twelve months, there are five seasons with a regular number of days in each season:
Season | Days |
---|---|
Chaos | 73 |
Discord | 73 |
Confusion | 73 |
Bureaucracy | 73 |
The Aftermath | 73 |
This results in a year of 365 days, in alignment with the Gregorian calendar; as a result, a given day in the Discordian calendar always corresponds to the same day in Gregorian.
In addition to there being five seasons, each week consists of five days: Sweetmorn, Boomtime, Pungenday, Prickle-Prickle and Setting Orange. Because the calendar is aligned to Gregorian, each Discordian year consists of 73 weeks of 5 days; because of this, each day in the calendar always has both the same day name and the same date.
In the Gregorian calendar, leap days are added in 97 out of 400 years, on a 4-yearly cycle. The same process applies in Discordianism, with St. Tib's day inserted between Chaos the 59th and 60th (February 28th and March 1st).
A final detail is that the Discordian calendar begins in 1166 BCE; years are counted in step with Gregorian since that time, and marked anno discordia, or "Years of Our Lady of Discord". A few examples of dates in both calendars follow.
    Chaos 1st, 3000 = January 1st, 1834
    Bureaucracy 70th, 3155 = October 16th, 1989
    St. Tib's Day, 3178 = February 29th, 2012
Because the Discordian calendar is very regular, conversion between Discordian and Gregorian dates is relatively simple. All that is required is to calculate the offsets for year, day and month.
    DYear = Year + 1166

    ; Handle leap years
    ; (Day is the zero-based day of the year, so February 29th is day 59)
    IF Year is-a-leap-year THEN
        IF Day = 59 THEN
            DSeasonday = "St. Tib's Day"
        ELSE IF Day > 59
            ; Days after Feb 29th need to be shifted up to make this
            ; year into a regular 365-day year, for calculation purposes
            Day = Day - 1
        END IF
    END IF

    SeasonNames = ["Chaos", "Discord", "Confusion", "Bureaucracy", "The Aftermath"]
    DayNames = ["Sweetmorn", "Boomtime", "Pungenday", "Prickle-Prickle", "Setting Orange"]

    IF DSeasonday is-not-already-set THEN
        DSeason = SeasonNames[Day / 73]
        DWeekday = DayNames[Day MOD 5]
        ; Season days are counted from 1
        DSeasonday = (Day MOD 73) + 1
    END IF
The above is a pseudocode sample for converting a date into Discordian, and takes account of the special case for leap years. Converting from Discordian back into Gregorian dates is similarly simple. The only complication involved is the leap-year case, where the date as reported by the Discordian calendar is one day ahead of where it would be in other years.
    Year = DYear - 1166

    ; Day is the one-based day of the year here
    Day = (DSeasonNum - 1) * 73 + DSeasondayNum

    IF Year is-a-leap-year THEN
        IF DSeasonday = "St. Tib's Day" THEN
            Day = 60
        ELSE IF Day >= 60
            Day = Day + 1
        END IF
    END IF
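The Gregorian-to-Discordian direction can be sketched as a small runnable function. This is an illustration under stated assumptions rather than a reference implementation: `toDiscordian`, `isLeapYear` and the shape of the returned object are names invented here, and the day of the year is worked out by hand from month lengths instead of using a date library.

```javascript
const SEASONS = ["Chaos", "Discord", "Confusion", "Bureaucracy", "The Aftermath"];
const WEEKDAYS = ["Sweetmorn", "Boomtime", "Pungenday", "Prickle-Prickle", "Setting Orange"];

// Gregorian leap-year rule: 97 leap years in every 400
function isLeapYear(year) {
    return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// month is 1-12 and day is 1-31, both Gregorian
function toDiscordian(year, month, day) {
    const leap = isLeapYear(year);
    const monthLengths = [31, leap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

    // Zero-based day of the year
    let yd = day - 1;
    for (let m = 0; m < month - 1; m++) yd += monthLengths[m];

    const dyear = year + 1166;
    if (leap && yd === 59) return { year: dyear, tibs: true };

    // Treat the rest of a leap year as a regular 365-day year
    if (leap && yd > 59) yd--;

    return {
        year: dyear,
        tibs: false,
        season: SEASONS[Math.floor(yd / 73)],
        weekday: WEEKDAYS[yd % 5],
        seasonDay: (yd % 73) + 1
    };
}

console.log(toDiscordian(1989, 10, 16)); // Bureaucracy 70, 3155
```

St. Tib's Day is returned as a flag rather than a season and day, since it sits outside the regular five-day week.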
Writing the above algorithms in Java is made very simple by the existence of `java.util.Calendar`, the date/calendar calculation class; in particular, the `GregorianCalendar` subclass allows for calculation of leap years in a quick and efficient manner. The following code implements conversions from one calendar to the other, providing a readable representation of the date in either case.
    import java.util.Date;
    import java.util.Calendar;
    import java.util.GregorianCalendar;

    public class ddate {
        private int _year, _season, _yearDay, _seasonDay, _weekDay;
        private boolean _isLeap;
        private String[] _seasonNames = {"Chaos","Discord","Confusion","Bureaucracy","The Aftermath"};
        private String[] _dayNames = {"Sweetmorn","Boomtime","Pungenday","Prickle-Prickle","Setting Orange"};

        public ddate(Date d) {
            GregorianCalendar gc = new GregorianCalendar();
            gc.setTime(d);

            _year = gc.get(Calendar.YEAR) + 1166;
            _yearDay = gc.get(Calendar.DAY_OF_YEAR);
            _isLeap = gc.isLeapYear(gc.get(Calendar.YEAR));

            // Zero-based day of the year; in leap years, days after
            // February 29th are shifted to fit a regular 365-day year
            int yd = _yearDay - 1;
            if(_isLeap && yd > 59) yd--;

            _season = (yd / 73) + 1;
            _weekDay = (yd % 5) + 1;
            _seasonDay = (yd % 73) + 1;
        }

        public int getYear() { return _year; }
        public int getSeason() { return _season; }
        public int getYearDay() { return _yearDay; }
        public int getSeasonDay() { return _seasonDay; }

        public String getSeasonName() { return _seasonNames[_season-1]; }
        public String getDayName() { return _dayNames[_weekDay-1]; }

        public String toString() {
            // DAY_OF_YEAR counts from 1, so February 29th is day 60
            if(_isLeap && _yearDay == 60) {
                return "St. Tib's Day, " + Integer.toString(_year);
            } else {
                return _dayNames[_weekDay-1] + ", " +
                       _seasonNames[_season-1] + " " +
                       Integer.toString(_seasonDay) + ", " +
                       Integer.toString(_year);
            }
        }

        public Date getTime() {
            GregorianCalendar gc = new GregorianCalendar();
            gc.set(Calendar.YEAR, _year - 1166);
            gc.set(Calendar.DAY_OF_YEAR, _yearDay);
            return gc.getTime();
        }
    }
    // Calendar is abstract, so an instance is obtained through getInstance()
    Calendar foo = Calendar.getInstance();
    foo.setTime(new Date());
    foo.add(Calendar.DAY_OF_MONTH, -1);

    ddate bar = new ddate(foo.getTime());
    System.out.println("Yesterday was " + bar.toString());
The conversion process detailed above doesn't include the ten Holy Days of the Discordian calendar: a Holy Day falls on the 5th and 50th of each season. Since these Days occur with such regularity, it poses no extra difficulty to provide an interface for this, and such an interface has been left out of the above code in the interest of brevity.
Imran Nazar <tf@imrannazar.com>, Mar 2010.
The `ignore` command is a gift from the Gods: it allows you to ignore any output being generated by a particular person or nickname, whether it's plain noise or something downright malicious. In the `irssi` client, you can go one further and specify ignore for any messages which match a given regular expression, from whichever nick they originate.
In some cases, you may not wish to ignore a person entirely; they may have the occasional insight into a topic, but simply act foolishly for most of the day. Alternatively, a particularly chatty bot may have some useful features, but will simply get in the way of channel discussion most of the time. In these situations, a halfway house between full participation and total ignorance is required.
If the channel window of a given IRC client has a standard high-contrast colour scheme (either white text on a black background, or vice versa), it's trivial to define a halfway point between text being fully visible and text being altogether hidden from view: grey text. Similarly, any lines of text that need particular attention paid to them can be highlighted in red, as an example. What is required is a way to denote lines worthy of attention, and lines eligible for half-ignore.
For the `irssi` client, the `nickcolor` extension comes close: it is able to assign a specific colour to a given nick, and highlight the rest in random colours. Unfortunately, there are a few drawbacks with `nickcolor` as it stands:

- `nickcolor` only works by nickname; it cannot apply regex matching on messages for the purpose of highlighting.

To address these issues, an adapted version of the `nickcolor` script is put forward in this article, which I've renamed as `linecolor`. An example of its usage would be as follows.
`linecolor`: Sample usage

    /color set Bucket 14
    /color rset ^Bucket 14
The above rules specify that any output from "Bucket" is to be marked as colour #14 (grey), and any output from any nick starting with the word "Bucket" is also to be marked grey. The resulting output is shown below.
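The order in which such rules are consulted can be sketched as follows. This follows the `find_color` routine in the script listed further down, which checks exact nick rules first and then tries each regex against the message text, ignoring case; the `nickRules` and `regexRules` maps and the `findColor` name are stand-ins invented for this illustration.

```javascript
// Hypothetical rule tables, mirroring "/color set Bucket 14" and "/color rset ^Bucket 14"
const nickRules = new Map([["Bucket", 14]]);
const regexRules = new Map([["^Bucket", 14]]);

// Returns the colour number for a line, or 0 if no rule applies
function findColor(nick, msg) {
    // An exact nick rule takes priority
    if (nickRules.has(nick)) return nickRules.get(nick);

    // Otherwise, try each regex rule against the message text, ignoring case
    for (const [pattern, color] of regexRules) {
        if (new RegExp(pattern, "i").test(msg)) return color;
    }
    return 0;
}

console.log(findColor("Bucket", "hello"));         // prints 14
console.log(findColor("alice", "bucket: help"));   // prints 14
console.log(findColor("alice", "hello"));          // prints 0
```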
The code is shown below, and is also available from http://imrannazar.com/content/img/linecolor.txt.
`linecolor`: An irssi script for rule-based line colouring

    # Line Color - Assign colours to lines from specific nicks, or matching patterns
    # Adapted from "Nick Color" by Timo Sirainen, as modified by Ian Petersi

    use strict;
    use Irssi 20020101.0250 ();
    use vars qw($VERSION %IRSSI);
    $VERSION = "1.2";
    %IRSSI = (
        authors     => "Timo Sirainen, Ian Petersi, Imran Nazar",
        contact     => "tss\@iki.fi",
        name        => "Line Color",
        description => "assign colours to lines through nick/regex rules",
        license     => "Public Domain",
        url         => "http://irssi.org/",
        changed     => "2010-01-28T18:30+0000"
    );

    # hm.. i should make it possible to use the existing one..
    Irssi::theme_register([
        'pubmsg_hilight', '{pubmsghinick $0 $3 $1}$2'
    ]);

    my %saved_colors;
    my %saved_regex_colors;
    my %session_colors = ();
    my @colors = qw/2 3 4 5 6 7 9 10 11 12 13/;

    sub load_colors {
        open COLORS, "$ENV{HOME}/.irssi/saved_colors";

        while (<COLORS>) {
            # I don't know why this is necessary only inside of irssi
            my @lines = split "\n";
            foreach my $line (@lines) {
                my($type, $nick, $color) = split ":", $line;
                if ($type eq "NICK") {
                    $saved_colors{$nick} = $color;
                } elsif ($type eq "REGEX") {
                    $saved_regex_colors{$nick} = $color;
                }
            }
        }

        close COLORS;
    }

    sub save_colors {
        open COLORS, ">$ENV{HOME}/.irssi/saved_colors";

        foreach my $nick (keys %saved_colors) {
            print COLORS "NICK:$nick:$saved_colors{$nick}\n";
        }
        foreach my $regex (keys %saved_regex_colors) {
            print COLORS "REGEX:$regex:$saved_regex_colors{$regex}\n";
        }

        Irssi::print("Saved colors to $ENV{HOME}/.irssi/saved_colors");

        close COLORS;
    }

    # If someone we've colored (either through the saved colors, or the hash
    # function) changes their nick, we'd like to keep the same color associated
    # with them (but only in the session_colors, ie a temporary mapping).
    sub sig_nick {
        my ($server, $newnick, $nick, $address) = @_;
        my $color;

        $newnick = substr ($newnick, 1) if ($newnick =~ /^:/);

        if ($color = $saved_colors{$nick}) {
            $session_colors{$newnick} = $color;
        } elsif ($color = $session_colors{$nick}) {
            $session_colors{$newnick} = $color;
        }
    }

    sub find_color {
        my ($server, $msg, $nick, $address, $target) = @_;
        my $chanrec = $server->channel_find($target);
        return if not $chanrec;
        my $nickrec = $chanrec->nick_find($nick);
        return if not $nickrec;
        my $nickmode = $nickrec->{op} ? "@" : $nickrec->{voice} ? "+" : "";

        # Has the user assigned this nick a color?
        my $color = $saved_colors{$nick};

        # Have -we- already assigned this nick a color?
        if (!$color) {
            $color = $session_colors{$nick};
        }

        # Does the message match any color regexen?
        if (!$color) {
            foreach my $r (keys %saved_regex_colors) {
                if ($msg =~ m/($r)/i) {
                    $color = $saved_regex_colors{$r};
                    last;
                }
            }
        }

        if (!$color) {
            $color = 0;
        }

        return $color;
    }

    # FIXME: breaks /HILIGHT etc.
    sub sig_public {
        my ($server, $msg, $nick, $address, $target) = @_;
        my $color = find_color(@_);

        if ($color) {
            $color = "0".$color if ($color < 10);
            $server->command('/^format pubmsg {pubmsgnick $2 {pubnick '.
                chr(3).$color.'$0'.
                chr(3).'15}}'.chr(3).$color.'$1');
        } else {
            $server->command('/^format pubmsg {pubmsgnick $2 {pubnick $0}}$1');
        }
    }

    sub sig_action {
        my ($server, $msg, $nick, $address, $target) = @_;
        my $color = find_color(@_);

        if($color) {
            $server->command('/^format action_public {pubaction '.
                chr(3).$color.'$0'.
                chr(3).'15}'.chr(3).$color.'$1');
        } else {
            $server->command('/^format action_public {pubaction $0}$1');
        }
    }

    sub cmd_color {
        my ($data, $server, $witem) = @_;
        my ($op, $nick, $color) = split " ", $data;

        $op = lc $op;

        if (!$op || $op eq "help") {
            Irssi::print ("Supported commands:
    preview (list possible colors and their codes)
    list (show current entries in saved_colors)
    set <nick> <color> (associate a color to a nick)
    rset <regex> <color> (colorize messages matching a regex)
    clear <nick> (delete color associated to nick)
    rclear <regex> (delete color associated to regex)
    save (save colorsettings to saved_colors file)");
        } elsif ($op eq "save") {
            save_colors;
        } elsif ($op eq "set") {
            if (!$nick) {
                Irssi::print ("Nick not given");
            } elsif (!$color) {
                Irssi::print ("Color not given");
            } elsif ($color < 2 || $color > 14) {
                Irssi::print ("Color must be between 2 and 14 inclusive");
            } else {
                $saved_colors{$nick} = $color;
            }
            Irssi::print ("Added ".chr (3) . "$saved_colors{$nick}$nick" .
                chr (3) . "1 ($saved_colors{$nick})");
        } elsif ($op eq "rset") {
            if (!$nick) {
                Irssi::print ("Regex not given");
            } elsif (!$color) {
                Irssi::print ("Color not given");
            } elsif ($color < 2 || $color > 14) {
                Irssi::print ("Color must be between 2 and 14 inclusive");
            } else {
                $saved_regex_colors{$nick} = $color;
            }
            Irssi::print ("Added ".chr (3) . "$saved_regex_colors{$nick}$nick" .
                chr (3) . "1 ($saved_regex_colors{$nick})");
        } elsif ($op eq "clear") {
            if (!$nick) {
                Irssi::print ("Nick not given");
            } else {
                delete ($saved_colors{$nick});
            }
            Irssi::print ("Cleared ".$nick);
        } elsif ($op eq "rclear") {
            if (!$nick) {
                Irssi::print ("Regex not given");
            } else {
                delete ($saved_regex_colors{$nick});
            }
            Irssi::print ("Cleared ".$nick);
        } elsif ($op eq "list") {
            Irssi::print ("\nSaved colors:");
            foreach my $nick (keys %saved_colors) {
                Irssi::print ("Nick: ".chr (3) . "$saved_colors{$nick}$nick" .
                    chr (3) . "1 ($saved_colors{$nick})");
            }
            foreach my $r (keys %saved_regex_colors) {
                Irssi::print ("Regex: ".chr (3) . "$saved_regex_colors{$r}$r" .
                    chr (3) . "1 ($saved_regex_colors{$r})");
            }
        } elsif ($op eq "preview") {
            Irssi::print ("\nAvailable colors:");
            foreach my $i (2..14) {
                Irssi::print (chr (3) . "$i" . "Color #$i");
            }
        }
    }

    load_colors;

    Irssi::command_bind('color', 'cmd_color');

    Irssi::signal_add('message public', 'sig_public');
    Irssi::signal_add('message irc action', 'sig_action');
    Irssi::signal_add('event nick', 'sig_nick');
Imran Nazar <tf@imrannazar.com>, Feb 2010.
Any alternative means of finding out what the Captcha image shows must be accessible for people who can't view the image, while also presenting a level of difficulty for automated and spam submissions; an option that meets both of these criteria is the audio Captcha.
The idea behind an audio Captcha is simple: in addition to providing the Captcha image on-screen, a sound file representing the image is made available. This caters for most users who would otherwise be unable to enter the Captcha text. This sound file can be a simple RIFF wave file, but is more often encoded into a speech codec or the ubiquitous MP3 format.
In this article, I'll be looking at the implementation of a simple MP3 audio Captcha, which takes a short string of a few characters and creates a sound file. I'll assume for this article that it's only made up of lowercase letters; there are no digits or uppercase letters, and no punctuation, in order to keep things at a minimal level. The audio Captcha algorithm is based on a series of sound files, each representing one letter, which can then be concatenated into a representation of the whole string.
In the ideal case, it would be simple to take the contents of each file and run them together into one large file, by writing out the contents of the files one after the other. This would be a trivial concatenation process, but will unfortunately not work. For the reason behind that, it's important to look at what makes up a RIFF wave file.
A RIFF wave file is more than a basic recording of the digitised waveform; in addition to the waveform data, metadata is attached regarding the size of the data and its origin.
Field | Size (bytes) |
---|---|
Chunk header: "RIFF" | 4 |
RIFF chunk size (file size-8) | 4 |
Chunk header: "WAVE" | 4 |
Subchunk header: "fmt " | 4 |
Format chunk size | 4 |
Format (1=PCM) | 2 |
Channel count | 2 |
Sampling rate (Hz) | 4 |
Bytes per second | 4 |
Block alignment value | 2 |
Bits per sample | 2 |
Subchunk header: "data" | 4 |
Data size | 4 |
File data | As given by data size |
The table above shows the format of the simplest RIFF wave file. The format is capable of holding information about wave files intended for MIDI samplers, cue points for mixing, and various other additions; most wave files will not contain these, and will simply be a record of the waveform data with a header attached.
As can be seen, the wave file specifies not only the length of the digitised waveform, but also its sampling rate and channel count. A telephone-level wave file can easily be distinguished from a CD-quality file, by simply checking the sampling rate; in a similar manner, stereo and mono files can be differentiated. The provision of this metadata about the file is the reason for the attachment of the header, since otherwise a sound player application would have no idea of the process for playing the sound file.
Unfortunately, this means that simple concatenation of two RIFF files won't result in a longer RIFF file. A sound player will read the headers at the start of the file, which indicate the length of the first segment to be concatenated, and play that segment; at this point, a reasonable player will deduce that the end of file has been reached, since its record of played samples is the same as the number indicated in the file header, and won't play any more of the file.
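The effect can be demonstrated with a short sketch. The code below builds two minimal 8-bit mono PCM wave files in memory, following the header layout in the table above, and joins them byte for byte; the header at the front of the result still declares only the first file's data size. The `makeWav` and `declaredDataSize` helpers, and the 8000 Hz sampling rate, are choices made for this illustration only.

```javascript
// Build a minimal 8-bit mono PCM wave file around the given sample bytes.
// Assumes the simplest 44-byte header, with a 16-byte format chunk.
function makeWav(samples) {
    const bytes = new Uint8Array(44 + samples.length);
    const view = new DataView(bytes.buffer);
    const tag = (offset, text) => {
        for (let i = 0; i < 4; i++) bytes[offset + i] = text.charCodeAt(i);
    };

    tag(0, "RIFF");
    view.setUint32(4, 36 + samples.length, true);   // RIFF chunk size (file size - 8)
    tag(8, "WAVE");
    tag(12, "fmt ");
    view.setUint32(16, 16, true);                   // Format chunk size
    view.setUint16(20, 1, true);                    // Format (1 = PCM)
    view.setUint16(22, 1, true);                    // Channel count
    view.setUint32(24, 8000, true);                 // Sampling rate (Hz)
    view.setUint32(28, 8000, true);                 // Bytes per second
    view.setUint16(32, 1, true);                    // Block alignment value
    view.setUint16(34, 8, true);                    // Bits per sample
    tag(36, "data");
    view.setUint32(40, samples.length, true);       // Data size
    bytes.set(samples, 44);
    return bytes;
}

// Data size as declared in the header, which is all a player has to go on
function declaredDataSize(wav) {
    return new DataView(wav.buffer, wav.byteOffset).getUint32(40, true);
}

const a = makeWav(new Uint8Array(100));
const b = makeWav(new Uint8Array(250));

// Naive concatenation: write one file out after the other
const joined = new Uint8Array(a.length + b.length);
joined.set(a, 0);
joined.set(b, a.length);

console.log(joined.length, declaredDataSize(joined)); // prints "438 100"
```

Of the 438 bytes in the joined file, a player trusting the header would read only the first 100 bytes of sample data and stop.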
The solution to this problem is to use a more complex concatenation: instead of simply throwing the files together, they will need to be run through an external sound processor.
The `sox` command is a simple interface to an audio concatenation and processing tool, which can be used for this audio Captcha. If each letter's wave file is passed into `sox`, a wave file can be output consisting of all the input files together, with an updated format header containing the total data size and overall sampling rates. An example invocation would run as follows:
sox a.wav x.wav m.wav b.wav -t .wav axmb.wav
Since each letter is contained in its own wave file, it's a trivial matter to break up the Captcha text string and build a command line for `sox` to use. The following example assumes that the Captcha script is written in PHP, and the text is held in the session data after generation.
    $parts = array();
    for($i = 0; $i < strlen($_SESSION['captcha']); $i++)
        $parts[] = $_SESSION['captcha'][$i] . '.wav';

    exec(sprintf('sox %s -t .wav %s.wav', join(' ', $parts), $_SESSION['captcha']));
What this doesn't do is generate an MP3 representing the Captcha text; for that, an MP3 encoder is required. `lame` allows for the encoding of MP3s at various sampling rates, but will normally take its sampling information from the input file. Since, as detailed above, a wave file contains detailed information about sampling and formatting, `lame` is able to use this to generate an MP3 file.

The example below is a slight modification of the `sox` invocation above, in order to pipe the output to `lame` and encode an MP3 file, and then to serve the MP3 out as a downloadable file.
    $parts = array();
    $c = $_SESSION['captcha'];
    for($i = 0; $i < strlen($c); $i++)
        $parts[] = $c[$i] . '.wav';

    // Concatenate the letters with sox, and pipe the result into lame
    exec(sprintf('sox %s -t .wav - | lame - %s.mp3', join(' ', $parts), $c));

    header('Content-type: audio/mpeg');
    header('Content-length: '.filesize("{$c}.mp3"));
    header('Content-disposition: attachment; filename="'.$c.'.mp3"');

    // Send the file contents out; passthru() would try to run the file as a command
    readfile("{$c}.mp3");
An example of this script's usage in a Captcha would be as follows.
In the above example, clearly voiced phrases have been used for the constituent letters of the audio Captcha. This provides a good level of accessibility, but compromises the security of the audio Captcha: any automatic circumventions will easily be able to work out the letters that make up the audio file. One solution to this is to overlay a level of noise on the audio file, to provide some level of obfuscation to the output; in addition to this, periods of silence can be inserted between the letter waveforms, making the output less regular.
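The silence idea can be sketched with Python's standard `wave` module, as one possible approach rather than the method used by this article's script. The function name `concat_with_silence` and the default gap length are assumptions; the sketch only covers inserting silence between letters, not overlaying noise, and it assumes uncompressed PCM input files that all share one format.

```python
import io
import wave

def concat_with_silence(wav_blobs, gap_seconds=0.25):
    """Join PCM wave files (given as bytes), with silence between them."""
    if not wav_blobs:
        raise ValueError("at least one wave file is required")
    out_buf = io.BytesIO()
    fmt = None
    with wave.open(out_buf, "wb") as out:
        for index, blob in enumerate(wav_blobs):
            with wave.open(io.BytesIO(blob), "rb") as src:
                this_fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if fmt is None:
                    # The first file decides the output format
                    fmt = this_fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif this_fmt != fmt:
                    raise ValueError("all inputs must share one format")
                if index > 0:
                    # 8-bit PCM is unsigned, so its silence level is 0x80
                    level = b"\x80" if fmt[1] == 1 else b"\x00"
                    gap_frames = int(gap_seconds * fmt[2])
                    out.writeframes(level * (gap_frames * fmt[0] * fmt[1]))
                out.writeframes(src.readframes(src.getnframes()))
    return out_buf.getvalue()
```

Varying `gap_seconds` randomly per letter would make the spacing less regular, at some cost to listenability.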
Another enhancement that can be made to the audio Captcha output is to provide more formats for the file. At present, the audio Captcha is generated in RIFF wave and MP3 formats; provision for Windows audio and Ogg formats would allow for more widespread usage of the output file.
Imran Nazar <tf@imrannazar.com>, Jan 2010.
Petter Källström emailed me, saying that there are a couple of weirdnesses and inefficiencies in the implementation given at the bottom of this article, and provided alternative code as follows:

    ; input  = r18, C and H flag
    ; output = r18 and C flag
    DAA:
        push r19
        in r19, SREG            ; Let r19 contain the SREG
        cpi r18, $9A            ; Set C flag in r19 if r18 >= 9A, and H flag in SREG if lower nibble is < 10
        brlo DAA_endif
        sbr r19, (1<<SREG_C)
    DAA_endif:
        sbrs r19, SREG_H        ; If input H flag was set, then skip the H-flag test
        brhs DAA_hi             ; If H indicate lower nibble is < 10, then jump over...
        subi r18, -$06          ; adjust (adjust if r19.H-flag set, or if lower nibble >= 10)
    DAA_hi:
        sbrc r19, SREG_C        ; If output C=1
        subi r18, -$60          ; ...adjust
        out SREG, r19
        pop r19
        ret
The Atmel AVR series of microcontrollers can be used in a wide variety of applications, from radios right through to inkjet printers, but a popular application in hobbyist projects is for digital clocks and counters. There are two main portions to any clock or counter program: a piece of code to increment the internal counter, and another piece of code to format and output the counter on a display.
A problem arises where these two portions of code need to interact. If one of these tasks is made less taxing for the microcontroller, the other is made more complicated; as a result, there are two prevailing schools of thought on how to achieve this interaction.
This article will examine the implications of choosing the second method: the use of a BCD number to hold the counter.
The concept behind BCD is a simple one: instead of using a byte to represent any value between 0 and 255, a byte is used to represent the decimal digits only: 0 to 9. This allows for each segment of a multiple-digit display to be tied directly to a byte in the number to show, which greatly simplifies the logic behind showing the number.
The disadvantage of using a full byte to represent each digit is the waste produced: over 95% of the usable range of numbers in a byte is lost, and a large number of bytes have to be stored for a number of significant size. In a microcontroller environment, where memory space is often at such a premium that one extra byte is significant, this wastage is simply untenable.
An alternative scheme is to use each nybble of a byte to store a BCD digit: in this manner, two digits can be stored inside a byte, increasing the range of values available for storage ten-fold. The code required to pull out digits for display is still very straightforward, since simple boolean operations will yield the required result.
A packed BCD number can be held in half the space of the equivalent full-BCD value, and is a viable compromise between the full range of binary numbers and the ease of display of full-BCD. In addition to this, packed BCD (hereafter referred to as simply "BCD") can be trivially conceptualised, through conversion to hexadecimal: as an example, the BCD value 0x93 represents decimal 93.
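As a small illustration of the packing described above, the following Python sketch converts between a decimal value and a packed BCD byte, and pulls out the two digits with shifts and masks. The function names are hypothetical and chosen only for this example.

```python
def to_packed_bcd(value):
    """Pack a decimal value from 0 to 99 into one byte, one digit per nybble."""
    if not 0 <= value <= 99:
        raise ValueError("a packed BCD byte holds 0 to 99")
    tens, units = divmod(value, 10)
    return (tens << 4) | units

def bcd_digits(byte):
    """Return (upper digit, lower digit) of a packed BCD byte."""
    return (byte >> 4) & 0x0F, byte & 0x0F

# Decimal 93 packs to the byte that reads as 93 in hexadecimal
print(hex(to_packed_bcd(93)))   # 0x93
print(bcd_digits(0x93))         # (9, 3)
```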
Using BCD to display a decimal number may simplify the display logic a great deal compared to the alternative, but a problem arises when calculations need to be done on the numbers. A microcontroller, much like any other computer of the modern age, is a binary machine with a binary arithmetic unit: it has no understanding of BCD, and will dutifully treat each number coming into it as a plain binary number.
    0x15 + 0x03 = 0x18
    0x72 + 0x07 = 0x79
    0x38 + 0x02 = 0x3A
It is in additions that cause a carry between digits that the problem appears. In the above example, the BCD numbers 0x38 and 0x02 should add to 0x40, but the addition has operated instead on the plain numbers and produced the wrong answer. What is required is a method of adjusting the value after addition, to account for the fact that the values being operated on are BCD.
The Intel IA-32 series of microprocessors contains such a method as part of the base instruction set: Decimal Adjust after Addition (DAA). If this instruction is run after an addition, the result stored in the accumulator will be adjusted.
    mov al, 38h
    add al, 03h     ; At this point, al = 0x3B
    daa             ; al = 0x41
The Atmel AVR doesn't contain such a convenient instruction as DAA, but the algorithm behind the DAA instruction is documented as part of the Intel IA-32 Reference manual, and is simple both to understand and to re-implement.
DAA will adjust a BCD value that has had a carry occur between digits. There are two situations where this applies:
Most processor architectures maintain a status flag denoting when a byte has carried past its maximum value; many architectures also maintain a half-carry flag, that is set when the lower nybble of a byte carries into the upper nybble. The half-carry flag will be set by a BCD addition that causes a binary carry in the lower digit, so checking for this will satisfy the other half of the DAA check.
If the DAA check finds a digit that needs adjusting, the fix is simple: a further addition onto the nybble in question.
    0x08 + 0x03 = 0x0B    ; Should be 0x11
    0x09 + 0x05 = 0x0E    ; Should be 0x14
    0x09 + 0x08 = 0x11    ; Should be 0x17
In every case, the value is six away from where it should be, so the adjustment adds six to bring the value back into BCD. Applying this process to both nybbles yields the final DAA algorithm.
    OLD_value = Value
    OLD_carry = Carry from addition

    # Check lower nybble
    IF (Half-carry set by addition) OR (Lower nybble of Value > 9)
        ADD 6 to Value
    FI

    # Check upper nybble
    # Upper nybble will be over 9 if original Value was over 0x99
    IF (OLD_carry) OR (OLD_value > 0x99)
        ADD 0x60 to Value
        Carry = 1    # BCD value carry occurred
    ELSE
        Carry = 0
    FI
The DAA algorithm sets the carry flag based on whether the upper nybble overflowed; this allows DAA to be used on BCD values across multiple bytes, by employing addition-with-carry on any higher denominations.
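Before translating to assembly, the algorithm can be checked against the earlier examples with a short Python model. This is a sketch under stated assumptions: the function names are hypothetical, the half-carry and carry flags are computed by hand in `add_bcd_byte` to mimic what a binary adder would report, and multi-byte values are held as lists with the lowest byte first.

```python
def daa(value, carry=False, half_carry=False):
    """Decimal-adjust the 8-bit result of adding two packed BCD bytes.

    Returns (adjusted value, carry out).
    """
    old_value = value & 0xFF
    result = old_value
    # Lower nybble: adjust on half-carry, or if the digit is past 9
    if half_carry or (old_value & 0x0F) > 9:
        result += 0x06
    # Upper nybble: adjust on carry, or if the original value was over 0x99
    if carry or old_value > 0x99:
        result += 0x60
        carry_out = True
    else:
        carry_out = False
    return result & 0xFF, carry_out

def add_bcd_byte(a, b, carry_in=False):
    """Binary-add two BCD bytes, then adjust, as ADD/ADC followed by DAA."""
    cin = 1 if carry_in else 0
    total = a + b + cin
    half_carry = ((a & 0x0F) + (b & 0x0F) + cin) > 0x0F
    return daa(total & 0xFF, total > 0xFF, half_carry)

def add_bcd(a_bytes, b_bytes):
    """Add two equal-length BCD values stored lowest byte first."""
    result, carry = [], False
    for a, b in zip(a_bytes, b_bytes):
        byte, carry = add_bcd_byte(a, b, carry)
        result.append(byte)
    return result, carry

value, carry = add_bcd_byte(0x38, 0x02)
print(hex(value), carry)
```

The last two lines rerun the addition that went wrong earlier in plain binary; with the adjustment applied, the result is the expected BCD value.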
Translating DAA from the algorithm detailed above results in the following AVR code.
    ; Parameters: R16 = value to adjust
    ; Returns: R16 = Adjusted value
    ;          Carry flag set if adjustment caused BCD carry
    DAA:
        push r17
        push r18
        push r19
        mov r17, r16            ; Working copy for the lower-nybble test
        mov r18, r16            ; Original value for the upper-nybble test
        in r19, SREG
        andi r19, (1<<SREG_C)   ; Keep only the carry from the addition
        clc
        brhs DAA_adjlo          ; Half-carry set: lower nybble needs adjusting
        andi r17, 0x0F
        cpi r17, 10
        brlo DAA_hi
    DAA_adjlo:
        ldi r17, 6
        add r16, r17
    DAA_hi:
        tst r19                 ; Carry from the addition?
        brne DAA_adjhi
        cpi r18, 0x9A           ; Original value over 0x99?
        brlo DAA_nadjhi
    DAA_adjhi:
        ldi r17, 0x60
        add r16, r17
        sec
        rjmp DAA_end
    DAA_nadjhi:
        clc
    DAA_end:
        pop r19                 ; r16 is not restored: it holds the result
        pop r18
        pop r17
        ret
Usage of the DAA routine for a two-byte BCD value stored in SRAM, would work like this:
    ; Add BCD 57 to the value stored at SRAM:0x100
    ldi xl, 0x00
    ldi xh, 0x01

    ; Read in low byte, add 57 BCD, and store
    ld r16, x
    ldi r17, 0x57
    add r16, r17
    call DAA
    st x+, r16

    ; Read in high byte, add carry from low byte, and store
    ld r16, x
    clr r17
    adc r16, r17
    call DAA
    st x, r16
The Intel IA-32 instruction set also contains a routine for decimal adjustment after subtraction, which operates in a slightly different manner to that detailed above. Development of such a routine for AVR is beyond the scope of this article, but can be done in a similar vein to DAA by pulling the algorithm from the IA-32 Reference manual.
If this routine proves useful, or if you come across any bugs with its operation, please feel free to let me know.
Imran Nazar <tf@imrannazar.com>, Dec 2009. Code released into the public domain.
Updated: [2010-07-28] Details added for SWAP, [2010-07-29] XOR n reinstated
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | NOP | LD BC,nn | LD (BC),A | INC BC | INC B | DEC B | LD B,n | RLC A | LD (nn),SP | ADD HL,BC | LD A,(BC) | DEC BC | INC C | DEC C | LD C,n | RRC A |
1x | STOP | LD DE,nn | LD (DE),A | INC DE | INC D | DEC D | LD D,n | RL A | JR n | ADD HL,DE | LD A,(DE) | DEC DE | INC E | DEC E | LD E,n | RR A |
2x | JR NZ,n | LD HL,nn | LDI (HL),A | INC HL | INC H | DEC H | LD H,n | DAA | JR Z,n | ADD HL,HL | LDI A,(HL) | DEC HL | INC L | DEC L | LD L,n | CPL |
3x | JR NC,n | LD SP,nn | LDD (HL),A | INC SP | INC (HL) | DEC (HL) | LD (HL),n | SCF | JR C,n | ADD HL,SP | LDD A,(HL) | DEC SP | INC A | DEC A | LD A,n | CCF |
4x | LD B,B | LD B,C | LD B,D | LD B,E | LD B,H | LD B,L | LD B,(HL) | LD B,A | LD C,B | LD C,C | LD C,D | LD C,E | LD C,H | LD C,L | LD C,(HL) | LD C,A |
5x | LD D,B | LD D,C | LD D,D | LD D,E | LD D,H | LD D,L | LD D,(HL) | LD D,A | LD E,B | LD E,C | LD E,D | LD E,E | LD E,H | LD E,L | LD E,(HL) | LD E,A |
6x | LD H,B | LD H,C | LD H,D | LD H,E | LD H,H | LD H,L | LD H,(HL) | LD H,A | LD L,B | LD L,C | LD L,D | LD L,E | LD L,H | LD L,L | LD L,(HL) | LD L,A |
7x | LD (HL),B | LD (HL),C | LD (HL),D | LD (HL),E | LD (HL),H | LD (HL),L | HALT | LD (HL),A | LD A,B | LD A,C | LD A,D | LD A,E | LD A,H | LD A,L | LD A,(HL) | LD A,A |
8x | ADD A,B | ADD A,C | ADD A,D | ADD A,E | ADD A,H | ADD A,L | ADD A,(HL) | ADD A,A | ADC A,B | ADC A,C | ADC A,D | ADC A,E | ADC A,H | ADC A,L | ADC A,(HL) | ADC A,A |
9x | SUB A,B | SUB A,C | SUB A,D | SUB A,E | SUB A,H | SUB A,L | SUB A,(HL) | SUB A,A | SBC A,B | SBC A,C | SBC A,D | SBC A,E | SBC A,H | SBC A,L | SBC A,(HL) | SBC A,A |
Ax | AND B | AND C | AND D | AND E | AND H | AND L | AND (HL) | AND A | XOR B | XOR C | XOR D | XOR E | XOR H | XOR L | XOR (HL) | XOR A |
Bx | OR B | OR C | OR D | OR E | OR H | OR L | OR (HL) | OR A | CP B | CP C | CP D | CP E | CP H | CP L | CP (HL) | CP A |
Cx | RET NZ | POP BC | JP NZ,nn | JP nn | CALL NZ,nn | PUSH BC | ADD A,n | RST 0 | RET Z | RET | JP Z,nn | Ext ops | CALL Z,nn | CALL nn | ADC A,n | RST 8 |
Dx | RET NC | POP DE | JP NC,nn | XX | CALL NC,nn | PUSH DE | SUB A,n | RST 10 | RET C | RETI | JP C,nn | XX | CALL C,nn | XX | SBC A,n | RST 18 |
Ex | LDH (n),A | POP HL | LDH (C),A | XX | XX | PUSH HL | AND n | RST 20 | ADD SP,d | JP (HL) | LD (nn),A | XX | XX | XX | XOR n | RST 28 |
Fx | LDH A,(n) | POP AF | XX | DI | XX | PUSH AF | OR n | RST 30 | LDHL SP,d | LD SP,HL | LD A,(nn) | EI | XX | XX | CP n | RST 38 |
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | RLC B | RLC C | RLC D | RLC E | RLC H | RLC L | RLC (HL) | RLC A | RRC B | RRC C | RRC D | RRC E | RRC H | RRC L | RRC (HL) | RRC A |
1x | RL B | RL C | RL D | RL E | RL H | RL L | RL (HL) | RL A | RR B | RR C | RR D | RR E | RR H | RR L | RR (HL) | RR A |
2x | SLA B | SLA C | SLA D | SLA E | SLA H | SLA L | SLA (HL) | SLA A | SRA B | SRA C | SRA D | SRA E | SRA H | SRA L | SRA (HL) | SRA A |
3x | SWAP B | SWAP C | SWAP D | SWAP E | SWAP H | SWAP L | SWAP (HL) | SWAP A | SRL B | SRL C | SRL D | SRL E | SRL H | SRL L | SRL (HL) | SRL A |
4x | BIT 0,B | BIT 0,C | BIT 0,D | BIT 0,E | BIT 0,H | BIT 0,L | BIT 0,(HL) | BIT 0,A | BIT 1,B | BIT 1,C | BIT 1,D | BIT 1,E | BIT 1,H | BIT 1,L | BIT 1,(HL) | BIT 1,A |
5x | BIT 2,B | BIT 2,C | BIT 2,D | BIT 2,E | BIT 2,H | BIT 2,L | BIT 2,(HL) | BIT 2,A | BIT 3,B | BIT 3,C | BIT 3,D | BIT 3,E | BIT 3,H | BIT 3,L | BIT 3,(HL) | BIT 3,A |
6x | BIT 4,B | BIT 4,C | BIT 4,D | BIT 4,E | BIT 4,H | BIT 4,L | BIT 4,(HL) | BIT 4,A | BIT 5,B | BIT 5,C | BIT 5,D | BIT 5,E | BIT 5,H | BIT 5,L | BIT 5,(HL) | BIT 5,A |
7x | BIT 6,B | BIT 6,C | BIT 6,D | BIT 6,E | BIT 6,H | BIT 6,L | BIT 6,(HL) | BIT 6,A | BIT 7,B | BIT 7,C | BIT 7,D | BIT 7,E | BIT 7,H | BIT 7,L | BIT 7,(HL) | BIT 7,A |
8x | RES 0,B | RES 0,C | RES 0,D | RES 0,E | RES 0,H | RES 0,L | RES 0,(HL) | RES 0,A | RES 1,B | RES 1,C | RES 1,D | RES 1,E | RES 1,H | RES 1,L | RES 1,(HL) | RES 1,A |
9x | RES 2,B | RES 2,C | RES 2,D | RES 2,E | RES 2,H | RES 2,L | RES 2,(HL) | RES 2,A | RES 3,B | RES 3,C | RES 3,D | RES 3,E | RES 3,H | RES 3,L | RES 3,(HL) | RES 3,A |
Ax | RES 4,B | RES 4,C | RES 4,D | RES 4,E | RES 4,H | RES 4,L | RES 4,(HL) | RES 4,A | RES 5,B | RES 5,C | RES 5,D | RES 5,E | RES 5,H | RES 5,L | RES 5,(HL) | RES 5,A |
Bx | RES 6,B | RES 6,C | RES 6,D | RES 6,E | RES 6,H | RES 6,L | RES 6,(HL) | RES 6,A | RES 7,B | RES 7,C | RES 7,D | RES 7,E | RES 7,H | RES 7,L | RES 7,(HL) | RES 7,A |
Cx | SET 0,B | SET 0,C | SET 0,D | SET 0,E | SET 0,H | SET 0,L | SET 0,(HL) | SET 0,A | SET 1,B | SET 1,C | SET 1,D | SET 1,E | SET 1,H | SET 1,L | SET 1,(HL) | SET 1,A |
Dx | SET 2,B | SET 2,C | SET 2,D | SET 2,E | SET 2,H | SET 2,L | SET 2,(HL) | SET 2,A | SET 3,B | SET 3,C | SET 3,D | SET 3,E | SET 3,H | SET 3,L | SET 3,(HL) | SET 3,A |
Ex | SET 4,B | SET 4,C | SET 4,D | SET 4,E | SET 4,H | SET 4,L | SET 4,(HL) | SET 4,A | SET 5,B | SET 5,C | SET 5,D | SET 5,E | SET 5,H | SET 5,L | SET 5,(HL) | SET 5,A |
Fx | SET 6,B | SET 6,C | SET 6,D | SET 6,E | SET 6,H | SET 6,L | SET 6,(HL) | SET 6,A | SET 7,B | SET 7,C | SET 7,D | SET 7,E | SET 7,H | SET 7,L | SET 7,(HL) | SET 7,A |
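The second table is regular enough that its entries can be computed rather than looked up. The Python sketch below assumes that this table lists the extended operations reached through the "Ext ops" entry of the first table; the function name `decode_ext` and the list names are hypothetical. The low three bits select the operand column, and the remaining bits select the operation and, for BIT, RES and SET, the bit number.

```python
# Operand order repeats every eight columns in the table
REGISTERS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]

# Rows 0x to 3x: one rotate/shift operation per group of eight opcodes
ROTATES = ["RLC", "RRC", "RL", "RR", "SLA", "SRA", "SWAP", "SRL"]

# Top two bits of the opcode for the remaining three quarters of the table
BIT_OPS = {1: "BIT", 2: "RES", 3: "SET"}

def decode_ext(opcode):
    """Return the mnemonic for one opcode of the extended-operations table."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    register = REGISTERS[opcode & 0x07]
    group = opcode >> 6             # 0 = rotates/shifts, 1-3 = BIT/RES/SET
    index = (opcode >> 3) & 0x07    # operation number, or bit number
    if group == 0:
        return f"{ROTATES[index]} {register}"
    return f"{BIT_OPS[group]} {index},{register}"

print(decode_ext(0x37))
print(decode_ext(0x7E))
```

The first table has no comparably simple rule for every cell, although rows 4x to Bx follow the same operand ordering.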
In the UK, Website Payments Pro is implemented as a REST or SOAP API, to which a website can connect and request transactions. There are, however, some pitfalls to implementing these requests:
I fell into all of the above traps when implementing WPP, so I produced the following article for reference purposes, and to serve as a coherent source of documentation for implementation of Website Payments Pro. Please note that throughout, the API revision used is 56.0, and REST calls will be used.
The PayPal REST API is called by POSTing a request to a secure (HTTPS) URI, with credentials generated by the holder of a "business" PayPal account. The credentials consist of an API "user" based on the account holder's email address, a credential password, and an encoded "signature" used as an additional checksum by the API. An example set of credentials would look as follows:
    USER: tf_api1.imrannazar.com
    PWD: QF63NP99NPER3V7A
    SIGNATURE: AZM6n0EcNmR0AQYsCf0s1VrwkV10AlKArJ7a8X4YHG-R2oFkOwGqVrJZ
    VERSION: 56.0
These variables are passed, along with the transaction request parameters, in a standard POST-formatted query to the previously mentioned HTTPS URI. The API will return a POST-formatted string with the result of the transaction request, which can be split manually or with a scripting language's built-in functions. The examples in this article are written in PHP, which provides functions to build and to break up POST-format strings. The following code will produce and send a request to the PayPal API, parsing and returning the result.
    class PaymentPaypal
    {
        const API_URI = 'https://api-3t.paypal.com/nvp';

        // PayPal API credential configuration
        private $config;

        // Data for this transaction (cart ID/contents/amount)
        private $data;

        function send($params)
        {
            $params['USER'] = $this->config['USER'];
            $params['PWD'] = $this->config['PWD'];
            $params['SIGNATURE'] = $this->config['SIGNATURE'];
            $params['VERSION'] = '56.0';

            // Fire up a POST request to PayPal
            $c = curl_init();
            curl_setopt_array($c, array(
                CURLOPT_URL => self::API_URI,
                CURLOPT_FAILONERROR => true,
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_FOLLOWLOCATION => true,
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => http_build_query($params)
            ));
            $result = curl_exec($c);

            if(!$result)
            {
                // Request failed at HTTP time; return the cURL error
                return array('ERROR' => curl_error($c));
            }
            else
            {
                // Request returned; break out the response into an array
                curl_close($c);
                $r = array();
                parse_str($result, $r);
                return $r;
            }
        }
    }
The config member variable for this class can be filled at construction time, and its handling is not shown here.
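The two string operations at the heart of the request, building the POST body and splitting the response, can be tried without any network access. The Python sketch below is an illustration only: the function names and the placeholder credentials are assumptions, and it uses `urllib.parse` in place of PHP's `http_build_query` and `parse_str`.

```python
from urllib.parse import urlencode, parse_qsl

def build_request_body(credentials, params, version="56.0"):
    """Merge credentials into the request parameters and POST-encode them."""
    fields = dict(params)
    fields["USER"] = credentials["USER"]
    fields["PWD"] = credentials["PWD"]
    fields["SIGNATURE"] = credentials["SIGNATURE"]
    fields["VERSION"] = version
    return urlencode(fields)

def parse_response(body):
    """Split a POST-formatted response string into a dictionary."""
    return dict(parse_qsl(body, keep_blank_values=True))

# Placeholder credentials, not real ones
creds = {"USER": "user_api1.example.com", "PWD": "secret", "SIGNATURE": "sig"}
print(build_request_body(creds, {"AMT": "10.00", "CURRENCYCODE": "GBP"}))
print(parse_response("ACK=Success&TOKEN=EC%2d123"))
```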
As shown above, a rudimentary catch can be made if PayPal refuses to respond properly, since cURL will produce an error in such a circumstance. However, if the request succeeds but PayPal returns an error, these errors must also be accounted for. PayPal allows for this by sending error messages back with the response as appropriate.
If any error messages are provided in the response, they will be provided in three formats:
- L_ERRORCODEx: An error code for referencing against PayPal's list of error codes and messages;
- L_SHORTMESSAGEx: A short technical description of the error;
- L_LONGMESSAGEx: A long description which can be printed direct to the user.

PayPal recommend that if a message is to be printed to the user when you trap an error, it should be the long message, since the short message may be difficult to understand. An example of such a situation is error code 10537, "Risk Control Country Filter Failure". The long message for this is currently documented as: "The transaction was refused because the country was prohibited as a result of your Country Monitor Risk Control Settings", which is eminently more understandable than the short message or the error code itself.
Since a PayPal transaction may result in more than one error, each set of error code and messages is suffixed with a number, denoted with "x" above. The numbers start at 0, which means that any failure in an API call is usually documented in the L_LONGMESSAGE0 part of the response. If L_LONGMESSAGE0 doesn't sufficiently explain the cause of the error, your code can check for any other errors that may have arisen.
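Walking the numbered error fields can be sketched as a simple loop over a parsed response. This Python example is illustrative: the function name `collect_errors` is hypothetical, and it assumes the suffixes run consecutively from 0 with no gaps.

```python
def collect_errors(response):
    """Gather every numbered error set from a parsed API response."""
    errors = []
    index = 0
    # Suffixes are assumed to count up from 0 without gaps
    while f"L_ERRORCODE{index}" in response:
        errors.append({
            "code": response[f"L_ERRORCODE{index}"],
            "short": response.get(f"L_SHORTMESSAGE{index}", ""),
            "long": response.get(f"L_LONGMESSAGE{index}", ""),
        })
        index += 1
    return errors

sample = {
    "ACK": "Failure",
    "L_ERRORCODE0": "10537",
    "L_SHORTMESSAGE0": "Risk Control Country Filter Failure",
    "L_LONGMESSAGE0": "The transaction was refused because the country was prohibited.",
}
for error in collect_errors(sample):
    print(error["code"], error["long"])
```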
As previously mentioned, PayPal allows users of Website Payments Pro to use either Express Checkout (EC) or Direct Payments (DP) to authenticate a transaction. In order to provide the choice, and to take credit card details if DP is to be used, a form must be presented to the user stating how much is to be paid, and showing the two methods.
PayPal stipulate the following requirements for this input form:
The form may be rendered however you wish, and the billing name and address can be taken beforehand from a previously registered account or provided directly with the card details in the DP form. An example would look as follows.
The Express Checkout process runs in three stages, allowing a user to log in to PayPal and use their account balance to pay for the transaction. Once they've logged in, they can confirm their information such as a shipping address if the website requires it, and then the transaction is sent through the API to be charged against the PayPal account.
The first step, referred to as SetExpressCheckout in the documentation, is to generate a transaction "token" that will allow both your site and PayPal to track through the three steps: this is done by calling PayPal through the API and requesting an EC token. Once the token has been generated, the user is forwarded to PayPal so that they can log in, and PayPal will then return the user to your site.
Req/Ret | Name | Value |
---|---|---|
Request | TRXTYPE | S |
Request | ACTION | S |
Request | AMT | Amount of the transaction |
Request | CURRENCYCODE | Currency of the transaction (GBP) |
Request | RETURNURL | URL for PayPal to direct to for stage 2 |
Request | CANCELURL | URL for PayPal to direct to if cancelled |
Return | ACK | Success or Failure |
Return | TOKEN | Generated token |
If the API return comes back with an ACK value of "Failure", the return will not contain a TOKEN; this can be used to check whether the request for a token succeeded.
The URL parameters to the stage-1 request allow a website to know whether a user has proceeded to step 2, or has cancelled the transaction at the PayPal side. By passing transaction information like the ID through these URLs, the website can be informed and take the appropriate action, such as marking a transaction as "Cancelled" if the CANCELURL is triggered. The URLs shown for this purpose below link into a fictitious routing framework, but they can be modified to match your configuration.
    class PaymentPaypal
    {
        function express_stg1()
        {
            $params = array(
                'TRXTYPE' => 'S',
                'ACTION' => 'S',
                'AMT' => $this->data['transaction_amount'],
                'CURRENCYCODE' => 'GBP',
                'RETURNURL' => ($this->config['SITEBASE'].'/checkout/ec2/'.$this->data['transaction_id']),
                'CANCELURL' => ($this->config['SITEBASE'].'/checkout/cancel/'.$this->data['transaction_id'])
            );

            $response = $this->send($params);

            if($response['ACK'] == 'Failure' || !isset($response['TOKEN']))
            {
                // Request failed; return error
                return array(
                    'status' => 'FAIL',
                    'msg' => $response['L_LONGMESSAGE0']
                );
            }
            else
            {
                // Request successful; forward user to PayPal and end script
                header('Location: https://www.paypal.com/cgi-bin/webscr?cmd=_express-checkout&token='.$response['TOKEN']);
                die('FORWARD');
            }
        }
    }
Once the user has logged into PayPal and returned to your site, PayPal allow for the retrieval of the token and further redirection for entering a shipping address and related details, through the step named GetExpressCheckout. This can be advantageous if your site doesn't have the ability to tie a billing and shipping address against a given user's profile, but PayPal state that this second step can be skipped if these details are already on file against a profile on your site. I've found it simpler to deal with these parts of the profile directly, than forward through PayPal again, so stage 2 of Express Checkout is simply a button to take the user to stage 3. This can be implemented, of course, as a direct link to stage 3 using the RETURNURL parameter to the stage-1 call.
Either way that this is implemented, the URL returned to will be provided with two parameters on the GET line: the original EC token generated in stage 1, and a PayerID which corresponds to the login given by the user to PayPal. The third stage of Express Checkout, DoExpressCheckout, does the actual work of firing a transaction through the API using the authenticated token provided by the PayPal user, and uses both of these as parameters to the API call.
Req/Ret | Name | Value |
---|---|---|
Request | TRXTYPE | S |
Request | ACTION | D |
Request | AMT | Amount of the transaction |
Request | CURRENCYCODE | Currency of the transaction (GBP) |
Request | TOKEN | EC token generated in stage 1 |
Request | PAYERID | PayerID returned to stage 2 |
Request | PAYMENTACTION | Sale or Authorization |
Return | RESULT | 0 if successful, positive for comms error, negative for declined |
Return | RESPMSG | Short description of EC result |
The only new parameter here is PAYMENTACTION; this allows for a PayPal account to be checked for authorisation without proceeding with a full sale, and can be useful for testing purposes as well as advanced purposes such as recurring billing and invoicing. Such features of the Express Checkout are beyond the scope of this article, but PayPal provide a PDF describing EC integration which goes into some detail about these. (Note, however, that this documentation is outdated in its description of the basic sending of API requests; the method described in this article is more current.) For the moment, it's sufficient to set this to Sale and request a full transaction every time.
In addition to checking the standard error codes for a PayPal response, it's prudent to check the RESULT of the stage-3 call, to ensure that it comes back as a successful transaction (zero). If another value is set, the RESPMSG will describe what happened to the transaction, such as "Declined".
    class PaymentPaypal
    {
        function express_stg3($token, $payerid)
        {
            $params = array(
                'TRXTYPE' => 'S',
                'ACTION' => 'D',
                'AMT' => $this->data['transaction_amount'],
                'CURRENCYCODE' => 'GBP',
                'TOKEN' => $token,
                'PAYERID' => $payerid,
                'PAYMENTACTION' => 'Sale'
            );

            $response = $this->send($params);

            if(isset($response['L_ERRORCODE0']) || $response['RESULT'] != 0 || !isset($response['TOKEN']))
            {
                return array(
                    'status' => 'FAIL',
                    'msg' => $response['RESPMSG']
                );
            }
            else
            {
                return array(
                    'status' => 'PASS',
                    'msg' => 'Transaction complete'
                );
            }
        }
    }
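The three-way meaning of RESULT given in the table above (zero, positive, negative) can be made explicit with a small classifier. This Python sketch is an illustration; the function name and the category labels are hypothetical, and a missing or non-numeric RESULT is treated as unknown rather than as success.

```python
def classify_result(response):
    """Map the RESULT field of a stage-3 response onto a coarse category."""
    try:
        result = int(response.get("RESULT", ""))
    except ValueError:
        # Missing or non-numeric RESULT: do not assume success
        return "unknown"
    if result == 0:
        return "success"
    # Positive values indicate a comms error, negative values a decline
    return "comms_error" if result > 0 else "declined"

print(classify_result({"RESULT": "0", "RESPMSG": "Approved"}))
print(classify_result({"RESULT": "-12", "RESPMSG": "Declined"}))
```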
The alternative to Express Checkout is Direct Payment, the immediate charging of a credit or debit card without the user needing to sign up to PayPal. DP takes a set of card details, along with a billing name and address, and sends them through PayPal; the response will either be a successful charge against the card, or a failure with one of a number of reasons: mismatched billing name, invalid card number, and so forth. Because a good deal of information is asked for by DP, all the request fields below are required unless marked otherwise.
Req/Ret | Name | Value |
---|---|---|
Transaction details | ||
Request | TRXTYPE | S |
Request | TENDER | C |
Request | AMT | Amount of the transaction |
Request | CURRENCYCODE | Currency of the transaction (GBP) |
Request | METHOD | DoDirectPayment |
Request | PAYMENTACTION | Sale |
Request | IPADDRESS | User's remote IP |
Card details | ||
Request | CREDITCARDTYPE | Type of card (Visa, MasterCard, Amex, Maestro, Solo) |
Request | ACCT | Card number (12-20 digits) |
Request | EXPDATE | Expiry date (MMYYYY, month is 01-12) |
Request | STARTDATE | Start date (MMYYYY, month is 01-12) Required for Maestro and Solo cards only |
Request | CVV2 | Card security code (3-6 digits) |
Request | FIRSTNAME | Cardholder's forename |
Request | LASTNAME | Cardholder's surname |
Request | STREET | Billing house number/name and street |
Request | STREET2 | [Optional] Second billing address line |
Request | CITY | Billing address town/city |
Request | ZIP | Billing address postcode |
Request | COUNTRYCODE | Card issuing country (GB) |
Return | ||
Return | ACK | Success or Failure |
Return | TRANSACTIONID | Alphanumeric ID given by PayPal |
Return | TIMESTAMP | ISO-formatted time and date of transaction |
Return | AMT | Amount of the transaction |
Return | CURRENCYCODE | Currency of the transaction (GBP) |
PayPal will request address and CVV matching against the card when you send it through for processing; the results of these matches will be provided in the response along with the other parameters above, as the AVSADDR, AVSZIP and CVV2MATCH response values. Each of these will have one of the following characters as its value:
PayPal won't decline a transaction if these fail, but you can record these values against the transaction in case anything comes of them. The important flag, as with Express Checkout, is ACK; this will indicate whether the charge succeeded.
    class PaymentPaypal
    {
        function direct($cc)
        {
            $params = array(
                'TRXTYPE' => 'S',
                'TENDER' => 'C',
                'AMT' => $this->data['transaction_amount'],
                'CURRENCYCODE' => 'GBP',
                'METHOD' => 'DoDirectPayment',
                'PAYMENTACTION' => 'Sale',
                'IPADDRESS' => $_SERVER['REMOTE_ADDR'],
                'CREDITCARDTYPE' => $cc['type'],
                'ACCT' => $cc['number'],
                'EXPDATE' => sprintf('%02d%04d', $cc['expmonth'], $cc['expyear']),
                'CVV2' => $cc['cvv'],
                'FIRSTNAME' => $this->data['user_fname'],
                'LASTNAME' => $this->data['user_sname'],
                'STREET' => $this->data['user_adstreet'],
                'CITY' => $this->data['user_adtown'],
                'ZIP' => $this->data['user_adpostcode'],
                'COUNTRYCODE' => 'GB',
            );

            // Fill in the start date if required
            if($cc['type'] == 'Maestro' || $cc['type'] == 'Solo')
            {
                $params['STARTDATE'] = sprintf('%02d%04d', $cc['startmonth'], $cc['startyear']);
            }

            $response = $this->send($params);

            if(isset($response['L_ERRORCODE0']) || $response['ACK'] == 'Failure')
            {
                return array(
                    'status' => 'FAIL',
                    'msg' => $response['L_LONGMESSAGE0']
                );
            }
            else
            {
                return array(
                    'status' => 'PASS',
                    'msg' => 'Transaction complete'
                );
            }
        }
    }
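The MMYYYY date format and the card-type rule for STARTDATE are easy to get subtly wrong, so they are sketched separately here in Python. The function names are hypothetical, and the `card` dictionary keys simply mirror the `$cc` array used in the PHP above.

```python
# Card types for which the table above marks STARTDATE as required
START_DATE_CARDS = {"Maestro", "Solo"}

def format_card_date(month, year):
    """Format a month and four-digit year as MMYYYY."""
    if not 1 <= month <= 12:
        raise ValueError("month must be 1 to 12")
    if not 0 <= year <= 9999:
        raise ValueError("year must have at most four digits")
    return f"{month:02d}{year:04d}"

def card_date_fields(card):
    """Build the EXPDATE (and, where required, STARTDATE) request fields."""
    fields = {"EXPDATE": format_card_date(card["expmonth"], card["expyear"])}
    if card["type"] in START_DATE_CARDS:
        fields["STARTDATE"] = format_card_date(card["startmonth"], card["startyear"])
    return fields

print(card_date_fields({"type": "Visa", "expmonth": 3, "expyear": 2011}))
```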
A few issues may come up while using the above routines; a couple of the most common ones I came across are documented below.
That's pretty much all you need to know in order to send a transaction through PayPal Website Payments Pro. I haven't covered the more advanced aspects, such as recurring billing and refunds, but these are little more than different TRXTYPE values and are adequately documented as such in PayPal's own documentation. Just be aware that the authentication methods have changed between various revisions of the API: if the documentation asks that you send a VENDOR value, you can safely ignore it.
Imran Nazar <tf@imrannazar.com>, Sep 2009
The launch had happened a few months before; a Paludis III rocket had blasted away from Guinea base, carrying a comms satellite and the HT probe. The probe had unfurled right on schedule, and started on its push to the Lagrange point that would be its destination. It was now reporting that it was approaching L5, and that it was ready to spool up the experiment.
James Kent was quite excited. Even though it was 4am local time, he hadn't been able to sleep knowing that HT was getting close to starting its job; he'd driven in to the lab to watch the data as it came in. As he arrived at the lab, it was obvious that no-one else involved on the project had been able to get any rest, either.
"Morning", called a voice over his shoulder as James sat down. It was Mike Rampton, the flight director; Mike would be overseeing operations in the command centre, while James was in charge of crunching the numbers to make sure the project ran smoothly.
"It's not morning yet, Mike. Status?" asked James.
"Looking good. HT's braking in towards L5 right now; obviously, our telemetry's about a minute delayed, so I'd say it's holding in place right now. We've secured the lunar scopes you wanted, so we should be able to see things as they happen."
James had asked for a day's worth of time on the new lunar telescopes sent up by Europe, mostly because it was difficult to see the Terran L5 point from Earth itself. It had set the project back a fair bit of money, but the telescopes had been contracted out to them for today, and both James and Mike intended to make good use of them.
"Good thing about the 'scopes is, we can see HT clear as day. With no atmosphere to cover up the view, our results will show up pretty well", Mike concluded.
"Sounds good, Mike. If we've arrived at L5, I'll start calibrating", James said. He sipped at a mug of coffee, and started inputting models into the computer on his desk.
While James crunched the coordinates, Mike confirmed that the probe had reached the Lagrange point. It had been quite easy for HT to slingshot across to the fifth point, where the Earth had been two months ago; little fuel had been expended on the journey, and the majority was going to be used in braking towards the gravitational stability point.
Theory held that at L5 (and conversely, at L4), one could plant a probe or satellite and have it stay there, rather than floating in towards the Earth or Sun. It would, of course, be moving at a speed equal to the Earth, since the Lagrange points moved with the Earth as it orbited; however, this would be the only speed required for calculations. That was why James had originally asked for one of these points to be used as the basis of the experiment.
Before too long, James indicated that he had calibrated things, and they were ready. Mike had adjusted one of the lunar 'scope feeds to show HT, in close-up, as it was parked at L5; the other 'scope was pointed at L4, on the other side of Earth's orbit, where it would be in two months' time. Mike sent out the signal for the probe to begin spinning up.
"We'll be able to see the spin-up in about two minutes' time, given signal delays. Should be another minute before things are at full speed", Mike stated. "Let's hope you've got all the gravities right, James."
"We're lucky that Jupiter is on the other side of the Sun right now, otherwise things would've got quite complicated with the coordinate calculation. As it is, we should end up more or less as expected", James replied.
As James continued with his coffee, HT returned that it was fully spun up.
"Drive?" Mike asked over the comm.
"Reporting 100%, caps are full. Settings programmed in; we're ready down here."
"Alright; fire it up."
Nothing happened; not immediately, at least. The telescopes still showed HT hanging in place at L5, for a minute or so. The tension mounted in the command room; James had a niggling sense that something might be off on his calculation, and HT would be destroyed or vanish never to be seen again. As quickly as the thought surfaced, it was discounted: a bright point flashed on the screen, and HT was gone. It had taken a minute for light to come from the L5 point to the lunar telescopes, and another few seconds for the images to relay to the command centre. HT was away.
A bright flash on the other monitor, and HT appeared. The stars behind it were different, however: the probe was now at L4, a full twenty million miles away. These images were also coming from the Moon, which meant that the move between Lagrange points had been almost instantaneous.
"What's that, 0.2 AU in half a second? Pretty quick, James", Mike laughed.
"And right where we wanted it, Mike. Let's get HT home and have a look at the insides; I wanna see what hyperspace did to my bacterial samples."
]]>The trouble starts to come in when the interface wishes to return a complicated result: a file that isn't XML or plain text, or a series of files in the same return package. There are various encodings used in the transfer of complex SOAP results; one of the more common is Direct Internet Message Encapsulation, or DIME.
DIME is essentially a wrapper over MIME, allowing multiple MIME parts to be sent in one package. The format was developed by Microsoft as a draft standard, and was adopted by a good number of SOAP interfaces before the official data standard was drawn up. The concept of the format is very simple: a series of files, either XML or binary data, with a short header on each file.
Each part can be marked as the first and/or last part of the message, and the type of data it contains can also be marked as XML or binary. Being a Microsoft standard, the definition of the header for each part involves bitfields and binary fiddling. There's also scope provided for extensions to the DIME format, but none of these were ever defined, so you're unlikely to find any messages with options filled in.
Field | Length | Description |
---|---|---|
Version | 5 bits | DIME format version (always 1) |
First Record | 1 bit | Set if this is the first part in the message |
Last Record | 1 bit | Set if this is the last part in the message |
Chunk Record | 1 bit | This file is broken into chunked parts |
Type Format | 4 bits | Type of file in the part (1 for binary data, 2 for XML) |
Reserved | 4 bits | (Classic Microsoft) |
The following fields are big-endian numbers | ||
Options Length | 2 bytes | Length of the "options" field |
ID Length | 2 bytes | Length of the "ID" or "name" field |
Type Length | 2 bytes | Length of the "type" field |
Data Length | 4 bytes | Size of the included file |
The following fields are variable-length, and padded to the next 4-byte boundary | ||
Options | Part-specific option data, if any is defined (safely answer "no") | |
ID | Name of the file/part | |
Type | If type format is 1: MIME type of the part data. If type format is 2: URI of the DTD file for the XML enclosed | |
Data | The file |
As you can see, it's possible for a DIME message to contain only one part, by marking it as both the first and the last. Each part follows directly on from the last, so it's easy enough to run through a DIME message in a loop, working out the position of each part by finding out where you end up after adding up the sizes of the four sections and the header (which is 12 bytes).
One complication of the format is that each section in a part (options, ID, type, data) is padded, so that it takes up an even multiple of 4 bytes; this is generally done by filling the gap with "0" bytes. For example, if the type of the file was given as "text/html", you'd end up with the following in the message:
74 65 78 74 2F 68 74 6D 6C 00 00 00 -- text/html
The first nine bytes above are the field data itself, defined by the header as 9 bytes long. The next multiple of 4 from there is 12, so three bytes of padding are added to push the field to a 4-byte boundary; these bytes are not counted as part of the field data.
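The padding arithmetic can be sketched in a few lines. This is an illustration only, written in JavaScript rather than PHP; the function names and the shape of the `lengths` object are assumptions made for the example, not part of any DIME library.

```javascript
// Round a field length up to the next multiple of 4, as DIME padding requires.
// Plain arithmetic is used instead of bit masks, so that 4-byte data lengths
// above 2^31 are not mangled by JavaScript's 32-bit bitwise operators.
function paddedLength(length) {
  return Math.ceil(length / 4) * 4;
}

// Total size of one record: the 12-byte header plus the four padded sections.
// `lengths` is a hypothetical object holding the four lengths read from the header.
function recordSize(lengths) {
  const HEADER_SIZE = 12;
  return HEADER_SIZE +
    paddedLength(lengths.options) +
    paddedLength(lengths.id) +
    paddedLength(lengths.type) +
    paddedLength(lengths.data);
}
```

For the "text/html" type above, `paddedLength(9)` gives 12; adding `recordSize` for one record to its starting offset gives the offset at which the next record begins.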
Using the structure pattern in PHP, it's quite a simple endeavour to build a class capable of reading in DIME messages and extracting the parts. The basis of this is the DIMERecord structure.
class DIMERecord {
    public $version;
    public $first;
    public $last;
    public $chunked;
    public $type_format;
    public $options;
    public $id;
    public $type;
    public $data;
}
Filling in this structure can be done from another class, acting as the DIME parser itself. It's this class which holds the array of DIMERecords referencing the parts.
class DIME {
    const TYPE_BINARY = 1;
    const TYPE_XML = 2;

    public $records;

    function __construct($input) {
        $this->records = array();
        $pos = 0;

        do {
            $r = new DIMERecord;

            // Shift out bitfields for the first fields
            $b = ord($input[$pos++]);
            $r->version = ($b>>3) & 31;
            $r->first = ($b>>2) & 1;
            $r->last = ($b>>1) & 1;
            $r->chunked = $b & 1;
            $r->type_format = (ord($input[$pos++]) >> 4) & 15;

            // Fetch big-endian lengths
            $lengths = array();
            $lengths['options'] = ord($input[$pos++]) << 8;
            $lengths['options'] |= ord($input[$pos++]);
            $lengths['id'] = ord($input[$pos++]) << 8;
            $lengths['id'] |= ord($input[$pos++]);
            $lengths['type'] = ord($input[$pos++]) << 8;
            $lengths['type'] |= ord($input[$pos++]);
            $lengths['data'] = ord($input[$pos++]) << 24;
            $lengths['data'] |= (ord($input[$pos++]) << 16);
            $lengths['data'] |= (ord($input[$pos++]) << 8);
            $lengths['data'] |= ord($input[$pos++]);

            // Read in padded data
            foreach($lengths as $lk => $lv) {
                $r->$lk = substr($input, $pos, $lv);
                $pos += $lv;
                if($lv & 3) $pos += (4-($lv & 3));
            }

            $this->records[] = $r;
        } while($pos < strlen($input));
    }
}
The DIME standard also accommodates the ability to break up a file across multiple parts, in case the client or server don't have the processing power to fill out a header for a big file all at once. Parsing a chunked file from its parts involves checking the "chunked" bit for the part being checked, and the part before it, making a decision based on the values:
This part chunked? | Previous part chunked? | Action |
---|---|---|
No | No | This is a normal file; save |
Yes | No | This is the first chunk part; start a data buffer |
Yes | Yes | This is a continuation chunk; append to the data buffer |
No | Yes | This is the last chunk part; append to the data buffer and save |
The type and id for the file are taken from the first chunk; any chunks after that have these fields set to zero, and have to be ignored. Implementing chunking involves extending the parser function, so that it holds a series of files as well as a series of records.
class DIMEFile {
    public $type_format;
    public $type;
    public $id;
    public $data;
}

class DIME {
    const TYPE_BINARY = 1;
    const TYPE_XML = 2;

    public $records;
    public $files;

    function __construct($input) {
        $this->records = array();
        $this->files = array();
        $pos = 0;

        // Break out parts from the message string
        do {
            $r = new DIMERecord;

            // Shift out bitfields for the first fields
            $b = ord($input[$pos++]);
            $r->version = ($b>>3) & 31;
            $r->first = ($b>>2) & 1;
            $r->last = ($b>>1) & 1;
            $r->chunked = $b & 1;
            $r->type_format = (ord($input[$pos++]) >> 4) & 15;

            // Fetch big-endian lengths
            $lengths = array();
            $lengths['options'] = ord($input[$pos++]) << 8;
            $lengths['options'] |= ord($input[$pos++]);
            $lengths['id'] = ord($input[$pos++]) << 8;
            $lengths['id'] |= ord($input[$pos++]);
            $lengths['type'] = ord($input[$pos++]) << 8;
            $lengths['type'] |= ord($input[$pos++]);
            $lengths['data'] = ord($input[$pos++]) << 24;
            $lengths['data'] |= (ord($input[$pos++]) << 16);
            $lengths['data'] |= (ord($input[$pos++]) << 8);
            $lengths['data'] |= ord($input[$pos++]);

            // Read in padded data
            foreach($lengths as $lk => $lv) {
                $r->$lk = substr($input, $pos, $lv);
                $pos += $lv;
                if($lv & 3) $pos += (4-($lv & 3));
            }

            $this->records[] = $r;
        } while($pos < strlen($input));

        // Unchunk records into files, as required
        $previous_chunk = 0;
        foreach($this->records as $r) {
            if(!$r->chunked) {
                if(!$previous_chunk) {
                    // Normal part
                    $f = new DIMEFile;
                    $f->type_format = $r->type_format;
                    $f->type = $r->type;
                    $f->id = $r->id;
                    $f->data = $r->data;
                    $this->files[] = $f;
                } else {
                    // Final chunk
                    $f->data .= $r->data;
                    $this->files[] = $f;
                }
            } else {
                if(!$previous_chunk) {
                    // First chunk
                    $f = new DIMEFile;
                    $f->type_format = $r->type_format;
                    $f->type = $r->type;
                    $f->id = $r->id;
                    $f->data = $r->data;
                } else {
                    // Continuation
                    $f->data .= $r->data;
                }
            }
            $previous_chunk = $r->chunked;
        }
    }
}
The JasperServer reporting service uses SOAP to allow requests for reports, and a DIME-encoded message to return the status message XML and the report itself as one result. The details for our example JasperServer are as follows:
WSDL URI | http://localhost:8080/jasperserver/services/repository?wsdl |
Namespace | http://www.jaspersoft.com/namespaces/php |
Request | runReport |
Report URI | /reports/inventory_list |
Using these access details, and passing them through PHP's native SOAP client, it's a simple matter to retrieve the DIME-encoded return message.
$request = '<?xml version="1.0" encoding="UTF-8"?>
<request operationName="runReport" locale="en">
    <argument name="RUN_OUTPUT_FORMAT">XLS</argument>
    <argument name="USE_DIME_ATTACHMENTS"><![CDATA[1]]></argument>
    <resourceDescriptor name="" wsType="reportUnit" uriString="/reports/inventory_list" isNew="false">
        <label></label>
    </resourceDescriptor>
</request>';

$c = new SoapClient(
    'http://localhost:8080/jasperserver/services/repository?wsdl',
    array('trace' => true));

try {
    $c->__soapCall(
        'runReport',
        array('request' => $request),
        array('namespace' => 'http://www.jaspersoft.com/namespaces/php'));
} catch(SoapFault $cf) {
    // A DIME-encoded message has no text, generating an exception
    // Parse out the traced response, and get the file from there
    // Response should be one XML file, and one binary
    $dp = new DIME($c->__getLastResponse());
    foreach($dp->files as $f) {
        if($f->type_format == DIME::TYPE_BINARY) {
            header('Content-type: '.$f->type);
            header('Content-disposition: attachment; filename="'.$f->id.'"');
            echo $f->data;
        }
    }
}
That's how you can use the DIME parser I've introduced here, to pull data out of a DIME-encoded SOAP response. As can be seen from the sample invocation here, all that's needed is to make a new DIME object from the message string, and check the array of files that's generated as a result.
Imran Nazar <tf@imrannazar.com>, Jul 2009
]]>from __future__ import leg
On Error Resume Next
With thanks to: letusgothen, TehLaser, VoidBoi, jercos, matja, apathy, asonge, Ended, tobias104, Emu*, MHD, lulzfish, Ysn, maafy6, RoadieRich.
]]>Many sites introduce a dynamic element to their content by including a "slide-show" subsection: a list of items or pictures which can be scrolled through and viewed in portions, and which repeats to the first item when the last item is visible. An example of such a slide-show is shown below.
In the above sample, the window is controlled by left and right anchors, which each give a direction for the window to move in by one step. It may seem complicated to implement such a slide-show, but this article will show you the basic pieces of code that go into making the slide-show work.
In order for the slide-show to operate, it's important to conceptualise what must be achieved. Each item in the section is joined horizontally to the next, to form a line of items. The user will see this line through a "window" which slides over the line, making a small portion visible.
The task of the slide-show is to control where this window lies, and when to move it. This can be achieved through simple use of HTML and JavaScript: a DIV block will act as the window, and the list of items is contained within the window.
<div id="window" style="overflow:hidden"> <ul> <li><img src="item1.jpg"></li> <li><img src="item2.jpg"></li> <li><img src="item3.jpg"></li> <li><img src="item4.jpg"></li> <li><img src="item5.jpg"></li> </ul> </div>
The inline style placed on the window DIV demonstrates what will occur if the window is set to be a smaller width than the list contained within: the remainder of the list will be cut-off, and hidden from view.
In order to move the window over the list of items, repeated small steps must be taken: moving the window in one operation will result in a simple change of view, as opposed to the desired slide. This effect of repeated steps can be achieved by the judicious use of timeouts: the function responsible for making a small step sets a timer which will, after a short while, run the same function again to make the next step.
For this to work properly, the function must be able to detect when a full item has been scrolled into view, and it's time to stop moving further. In this article, I'll assume that the items in the list are all the same width, and that this width has been defined for the JavaScript function; it's possible, though more time-consuming, to record the positions of the list items in relation to each other, and detect the edge of an item by checking the window position against these item positions.
timestep = 50;
pos = 0;
posstep = 4;
curtravel = 0;

step = function() {
    timer = setTimeout(function(){step();}, timestep);

    curtravel += posstep;
    if(curtravel >= itemwidth) {
        curtravel = 0;
        clearTimeout(timer);
    }

    pos += posstep;
    if(pos >= listwidth) pos = 0;
    document.getElementById('window').style.left = pos+'px';
};
Note that in the above code, the window movement code will detect whether the window has reached the end of the list; if so, it will wrap to the start, and the first item will again be displayed.
A problem presents itself with this simple approach: when approaching the end of the list, blank areas are shown before the first item appears once more. This is because the window is being positioned such that a part of it is past the end of the list, whereas the first item of the list is (as expected) at the start.
This issue can be alleviated by repositioning the items in the list, to allow for the first item to be rendered after the end of the list, and thus for the window to encapsulate the area while showing that the list is wrapping to the start. In order to do this, the items in the list must have their positions checked for each step through the movement of the window: if the item is positioned on the "wrong" end of the list to the window movement, it must be moved to the other side.
#window ul { position: relative; } #window li { position: absolute; }
step = function() {
    /* Initialise a timer, to do the step after this one */
    timer = setTimeout(function(){step();}, timestep);

    /* Set the window's new position */
    pos += posstep;
    if(pos >= listwidth) pos = 0;

    /* Check each item in the list, to see if it's outside the bounds
       of the window and if so, move it to the other side of the
       window to allow for scroll-in */
    items = document.getElementById('window').getElementsByTagName('ul')[0].getElementsByTagName('li');
    for(var i=0; i<items.length; i++) {
        if(pos >= 0) {
            itempos = i*itemwidth;
            if(itempos+pos > windowwidth) itempos -= listwidth;
            if(itempos+pos < -(windowwidth+itemwidth)) itempos += listwidth;
            items[i].style.left = itempos+'px';
        }
    }

    /* Check if we're at the end of a scroll; if so, stop the timer */
    curtravel += posstep;
    if(curtravel >= itemwidth) {
        curtravel = 0;
        clearTimeout(timer);
    }

    /* Set the new window position */
    document.getElementById('window').style.left = pos+'px';
};
The above code allows for the window to move rightwards over the list, which corresponds to a visual effect of pushing the list off-screen to the left, and bringing more items on-screen from the right. It's important that the code also allow for items to be scrolled to the right, which would involve moving the window leftwards over the list. This can be achieved by modifying the step function somewhat, to include clauses for the reverse direction of travel.
step = function(stepdir) {
    timer = setTimeout(function(){step(stepdir);}, timestep);
    items = document.getElementById('window').getElementsByTagName('ul')[0].getElementsByTagName('li');

    if(stepdir < 0) {
        /* Handle moving the window left; the same as before, but
           with all the comparison and addition signs flipped */
        pos -= posstep;
        if(pos <= -listwidth) pos = 0;

        for(var i=0; i<items.length; i++) {
            if(pos <= 0) {
                itempos = i*itemwidth;
                if(itempos+pos < -listwidth) itempos += listwidth;
                items[i].style.left = itempos+'px';
            }
        }
    } else {
        /* Handle moving the window right; this code is the same as before */
        pos += posstep;
        if(pos >= listwidth) pos = 0;

        for(var i=0; i<items.length; i++) {
            if(pos >= 0) {
                itempos = i*itemwidth;
                if(itempos+pos > windowwidth) itempos -= listwidth;
                if(itempos+pos < -(windowwidth+itemwidth)) itempos += listwidth;
                items[i].style.left = itempos+'px';
            }
        }
    }

    curtravel += posstep;
    if(curtravel >= itemwidth) {
        curtravel = 0;
        clearTimeout(timer);
    }

    document.getElementById('window').style.left = pos+'px';
};
Now all that must be remembered is that scrolling the items left (moving the window rightwards) entails a positive step direction, and that scrolling the items right (moving the window leftwards) is a negative step direction.
In principle, it's quite easy to adapt this slideshow to travel vertically instead of horizontally. However, the fact that this system relies on a fixed width for each list item is a disadvantage when it comes to vertical travel: as can be seen in the above sample, items will differ in height depending on their content.
To alleviate this, the system previously mentioned can be employed: a list is maintained of the positions of the items in the list, and these are used to check the bounds of the window. If an item's initial position combined with the window's position is outside the window, this will force it to move to the other side of the list.
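As a rough sketch of that bookkeeping, the positions can be built once from the item sizes and then consulted to decide where the window should next come to rest. The function names `itemOffsets` and `nextStop` are invented for this example; in a real page the sizes would have to be measured from the list items themselves (each item's `offsetHeight` is one likely source, though that is an assumption rather than something the slide-show code above does).

```javascript
// Build the list of start positions for items of differing sizes.
// `sizes` is an array of item heights (or widths), in pixels.
function itemOffsets(sizes) {
  const offsets = [];
  let total = 0;
  for (const size of sizes) {
    offsets.push(total);   // this item starts where the previous ones end
    total += size;
  }
  return { offsets: offsets, total: total };
}

// Given the current window position, find the position of the next item
// edge when travelling forwards; past the last item, wrap to the start.
function nextStop(offsets, pos) {
  for (const offset of offsets) {
    if (offset > pos) return offset;
  }
  return 0;
}
```

With item sizes of 120, 80 and 200 pixels, the offsets come out as 0, 120 and 200 with a total length of 400; a window resting at 0 would next stop at 120, and one resting at 200 would wrap back to 0.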
This system can easily be extended to allow for image slideshows, by changing the contents of the list to a list of images of the same width; by combining the images with anchors, a thumbnail slideshow can be created. Feel free to extend the code to your heart's content.
Imran Nazar <tf@imrannazar.com>, 2009
]]>The place was palatable enough, if a bit lightweight: bed, shower, toilet cubicle, microwave oven. The east-facing wall was a viewscreen, which he could switch to transparent if he wanted to see what was going on outside, or to any of the dozens of full-definition channel feeds. There was, however, no way for him to talk to the outside: no terminals, no data sockets. Even his cellphone didn't work in here: it didn't seem to be picking up a signal.
He was on the ground floor of an apartment block, in room 106. The block was eight stories high, and there were eight apartments on each floor. This was Borneo Block 1, based in the equatorial country of the same name, and was the ideal spot to put such a place as this.
Some of the apartments were occupied by families, some by lone residents such as himself; none were empty. And none could be opened from the inside; once they were sealed from the outside, only an extraordinary emergency could open the doors. That was hardly likely to happen here, with no other buildings within five miles of the block.
It was a prison. A comfortable prison, with a short stay, but a prison none the less.
Every resident of the block would be here for four days, after which time a new list of tenants would be drawn up and moved in. Ryan had just moved in, so he settled down on the bed, identical to the one in 303 where he'd been placed last time, and flipped the viewscreen to Pacific News.
Four days later, it was time to leave. As Ryan stepped out of the shower, the apartment door unsealed and opened by itself. He dressed quickly, and made to leave. As he stepped out of the apartment, the door sealed itself shut again. He was met by Paul, manager of the summer shift at Borneo Block.
"Welcome back, Ryan. How's Earth?"
"Still there," he replied.
The apartment block began to draw away from them both; as it retreated, Ryan could see a blue border encroach on the edges of the block. The border got larger as the block dwindled, and intrusions of white and brown appeared in spots and wisps.
Ryan kept one eye on the apartment block as it fell back towards Earth, his temporary home for what the space elevator engineers called Lift Time; four days was the quickest comfortable journey up to orbit, but the block would fall much more quickly with no residents to cater for.
"We've got some retensioning to do on Four, if you want to jump straight in," Paul stated. Ryan was a maintenance engineer for the Borneo elevator, and cable tension was a part of his job.
"Let's get started, then."
]]>He opened his eyes. Expecting to see the dark green of his tent over him, he found a blue sky, tinged with the orange of a rising sun. He was indeed in the open, so where was his tent?
He sat up, rubbing his eyes, trying to focus. Around him, there was just grass; it was an open field, and he was apparently asleep right in the middle. He couldn't remember finding this field; even if he had picked this place to sleep overnight, his tent would've been over him, and he'd be nearer the woods. Maybe the tent blew away last night, but he couldn't see it now. He'd have to find another at some point.
He looked behind him, and there was a house in the distance. With the sun behind it, lying in its own shadow, the house looked stark. He could see, though, that it was a wooden house. The walls were lime-washed, and it looked like some of the windows were broken. The front door had been boarded over at one point, but the board had fallen away on one side.
He felt himself being drawn to the house, for some reason. Maybe because the side window was open just enough for one person to get through, though anything useful was probably long gone. His plan was to head further south today; his old map showed a village by the road, which might prove a good source of food for the next couple of weeks.
He got up, and made ready to leave. Instead of heading south, he turned around to face the house. He found himself walking towards the open window, as though something was pushing him towards it; as though a command had been given.
> GO NORTHEAST
The most basic display elements on an HTML page are easy to understand: paragraphs, headings and tables. Often, a page is broken up into sections where each section contains these elements; these sections can be defined in the HTML source as blocks. The problem that arises from this is how to display the separate sections in an intuitive manner.
An example would be a login form, on which you can provide a username and password to log in. Another section of the page provides a "forgot-password" form, where an email address can be entered and a retrieval email sent. This page may be written in two sections, as below.
<h2>Login</h2> <form action="/login" method="post"> <fieldset> <legend>Provide login details</legend> <label for="user">Username:</label><input type="text" name="username" id="user"> <label for="pass">Password:</label><input type="password" name="password" id="pass"> <input type="hidden" name="do" value="login"> <input type="submit" name="go" value="Login"> </fieldset> </form> <h2>Forgot Password</h2> <form action="/login" method="post"> <fieldset> <legend>Provide e-mail</legend> <label for="email">E-mail:</label><input type="text" name="email" id="email"> <input type="hidden" name="do" value="forgot"> <input type="submit" name="go" value="Get Password"> </fieldset> </form>
Having these two forms directly in line with each other could cause some confusion for the user. There are a few ways to alleviate this: bringing the two forms alongside each other, for example, would allow a visual separation of the two functions. The most effective display method in this situation, however, is tabbing.
It's very likely that you've come across tabs before: they've been used by web browsers for many years as a way to display multiple pages in the same browser window. There are two components to a tabbing system: the tab list and the tab contents. Each entry in the tab list has an associated content block: when a given entry in the tab list is selected, the content block for that entry is displayed and the other content blocks are hidden away.
In the above example, four separate pages are open in the same web browser instance: the second is selected. As can be seen, it's obvious which tab is selected, and which contents are being displayed as a result. Tabs aren't just a graphical concept, however: they can be used equally well in a text-based environment.
In this example, four terminals are open in the multiple-terminal screen, and each one has an entry on the tab list at the bottom. The third terminal (an editor session) is currently selected, and the tab list reflects this by highlighting the third tab.
Tabs in a web page are visually very similar to these two interfaces; the above example of a login and forgot-password interface may be implemented as per the following diagrams.
As stated above, the tab contents must be sectioned before tabbing can be applied; a tab list must also be present to allow switching between tabs. A simple way to break the content down is by placing each section in a DIV. The correspondence between tab list item and tab content is maintained by giving each DIV an id, which is used as the rel attribute on the list item. As described in the JavaScript section, the tab switcher will use this rel to determine which tab content to switch in.
<ul class="tablist" id="tablist-login"> <li rel="tab-login">Login</li> <li rel="tab-forgot">Forgot Password</li> </ul> <div class="tab" id="tab-login"> <form action="/login" method="post"> <fieldset> <legend>Provide login details</legend> <label for="user">Username:</label><input type="text" name="username" id="user"> <label for="pass">Password:</label><input type="password" name="password" id="pass"> <input type="hidden" name="do" value="login"> <input type="submit" name="go" value="Login"> </fieldset> </form> </div> <div class="tab" id="tab-forgot"> <form action="/login" method="post"> <fieldset> <legend>Provide e-mail</legend> <label for="email">E-mail:</label><input type="text" name="email" id="email"> <input type="hidden" name="do" value="forgot"> <input type="submit" name="go" value="Get Password"> </fieldset> </form> </div>
Each tab in the tablist can be in one of two states: active (the currently selected tab) or inactive. In the above example, the tab list has been coded as an unordered list, which means that the list items must be floated next to each other if they are to appear on the same line.
The tab content DIV is a simple matter to style: a black border will suffice. The tab list, however, has to be positioned such that the "active" tab will visually merge with the tab contents. The easiest way to do this is to give the active tab and the tab content box the same background (in this case, white), and to set a bottom border on the active tab of white. From here, the tab list can be positioned to overlay the tab content, causing the active tab's white border to visually override the content's black border.
In CSS, the implementation could be as follows.
/* Tab list: no bullets */ul.tablist { list-style: none inside; margin: 0; padding: 0; }/* Tab list item: floated, pushed down one pixel */ul.tablist li { display: block; float: left; background: #ddd; border-top: 1px solid #ddd; border-bottom: 1px solid black; position: relative; bottom: -1px; padding: 0.5em; margin-right: 2px; cursor: pointer; }/* Tab list item (active): white bottom border */ul.tablist li.active { background: white; border-left: 1px solid black; border-right: 1px solid black; border-top: 1px solid black; border-bottom: 1px solid white; }/* Tab: black border */div.tab { border: 1px solid black; clear: both; padding: 0.5em; }
The most important part of the tabbing system is the active component: that part which switches in a tab and switches out the others, when an item on the tab list is clicked. In order to do this, a mapping must be maintained of which tab list items are in a particular list; this map can be created at the time the page is loaded.
At initialisation time, each item in a tab list is also given an onclick function, to activate the switching mechanism when the tab is clicked by the user. The mechanism is a simple loop, determining which tab content boxes are to be switched, and hiding every tab except the one requested.
tabSwitcher = {
    _map: {},

    init: function() {
        // Check each UL on the page, to see if it's a tablist
        var lists = document.getElementsByTagName('ul');
        for(var i=0; i<lists.length; i++) {
            if(lists[i].className.indexOf('tablist') >= 0) {
                // If we find a tablist, put each item in the map
                var items = lists[i].getElementsByTagName('li');
                for(var j=0; j<items.length; j++) {
                    // Map the item's REL attribute to this tablist
                    tabSwitcher._map[items[j].getAttribute('rel')] = lists[i].id;

                    // When the user clicks this item, run switcher
                    items[j].onclick = function() {
                        tabSwitcher.action(this.getAttribute('rel'));
                        return false;
                    };
                }

                // Leave this tab list in a default state of
                // first item active
                tabSwitcher.action(items[0].getAttribute('rel'));
            }
        }
    },

    action: function(target) {
        // Fetch all the tab list items in the same list as the target
        var tablist = document.getElementById(tabSwitcher._map[target]);
        var listitems = tablist.getElementsByTagName('li');

        for(var k=0; k<listitems.length; k++) {
            // If this item's REL is the same as the clicked item,
            // activate the tab list item and show the content
            var rel = listitems[k].getAttribute('rel');
            if(rel == target) {
                // "active" matches the ul.tablist li.active rule in the CSS
                listitems[k].className = 'active';
                document.getElementById(rel).style.display = 'block';
            }
            // Otherwise, make the tab list item inactive and hide the content
            else {
                listitems[k].className = '';
                document.getElementById(rel).style.display = 'none';
            }
        }
    }
};

window.onload = tabSwitcher.init;
Putting all these code sections together provides:
Since the above JavaScript code is designed to map a tab list item to the list within which it's contained, it's possible to place multiple tab lists on the same page, and have each work independently; the tab switcher will maintain the relations to the appropriate tab lists in its internal map. This can be used for a detailed drill-down display, or any other point at which a tab list could be nested within another tab.
The styling of tabs can also be enhanced, to make judicious use of rounded tabs, colouring and the like; since the styling has been separated from the presentational HTML, restyling the tabs is merely a matter of changing the CSS used to define the tab styles.
Imran Nazar <tf@oopsilon.com>, 2009
]]>The following is just such a wrapper, that I wrote to get an application running on a new host quickly. It's a hack, and functionality is missing that would otherwise be in PDO, but it covers the basics.
<?php class NotPDO { private $dbconn; function NotPDO($dsn, $user='', $pass='') { $dsnparts = explode(':', $dsn); switch($dsnparts[0]) { case 'mysql': $dsnparams = explode(';', $dsnparts[1]); foreach($dsnparams as $dsnp) { $dsnpv = explode('=', $dsnp); switch($dsnpv[0]) { case 'host': $host = $dsnpv[1]; break; case 'dbname': $dbname = $dsnpv[1]; break; } } if(isset($host) && isset($dbname)) { $this->dbconn = mysql_connect($host, $user, $pass); if(!$this->dbconn) die('NotPDO: Database connection failed.'); if(!mysql_select_db($dbname, $this->dbconn)) die('NotPDO: Could not select database.'); } else { die('NotPDO: Database not specified.'); } break; default: die('NotPDO: Database type not supported.'); } } function prepare($q) { return new NotPDOQuery($q, $this->dbconn); } function query($q) { $q = new NotPDOQuery($q, $this->dbconn); $q->execute(); return $q; } }; class NotPDOQuery { private $dbconn; private $q; private $r; function NotPDOQuery($query, $dbconn) { $this->dbconn = $dbconn; $this->q = $query; } function bindParam($param, $val) { if(is_numeric($val)) { $this->q = str_replace( $param, mysql_real_escape_string($val, $this->dbconn), $this->q); } else { $this->q = str_replace( $param, "'".mysql_real_escape_string($val, $this->dbconn)."'", $this->q); } } function execute() { $this->r = mysql_query($this->q); if($this->r) return true; else return false; } function fetch() { return mysql_fetch_assoc($this->r); } function fetchAll() { $arr = array(); if(mysql_num_rows($this->r)) mysql_data_seek($this->r, 0); while($row = mysql_fetch_assoc($this->r)) $arr[] = $row; return $arr; } }; ?>]]>
Node ID | Name | Left MPTT value | Right MPTT value |
---|---|---|---|
1 | (Root) | 1 | 16 |
2 | Articles | 2 | 11 |
5 | Fiction | 3 | 8 |
7 | Fantasy | 4 | 5 |
8 | Sci-fi | 6 | 7 |
6 | Reference | 9 | 10 |
3 | Portfolio | 12 | 13 |
4 | Contact | 14 | 15 |
SELECT * from Pages order by mpttLeft asc
Numbers are assigned such that a path can be traced around the tree, taking in every node. The path here starts at (Root), flowing down the left, around the bottom of the Fiction subtree, then up to the Reference branch of Articles, and from there to the other branches of (Root), before flowing back to (Root).
Note that leaf nodes (those with no children) have Left and Right values immediately after each other; Portfolio, for example, is 12/13. Note also that the parent of a node has a smaller Left and a bigger Right; this can be used to trace up the tree finding parent nodes, until you hit a Left of 1 (meaning the root node). For example, Fantasy (4/5) has a parent of Fiction (3/8), which has a parent of Articles (2/11).
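The two observations above (leaves have adjacent Left/Right values, and every ancestor has a smaller Left and a bigger Right) can be sketched in a few lines of Python. This is only an illustration over the rows of the table; the names `PAGES`, `ancestors` and `is_leaf` are made up for the example and are not part of any schema described here.

```python
# Rows copied from the table above: (node_id, name, left, right)
PAGES = [
    (1, "(Root)", 1, 16),
    (2, "Articles", 2, 11),
    (5, "Fiction", 3, 8),
    (7, "Fantasy", 4, 5),
    (8, "Sci-fi", 6, 7),
    (6, "Reference", 9, 10),
    (3, "Portfolio", 12, 13),
    (4, "Contact", 14, 15),
]


def bounds(pages, name):
    """Return the (left, right) pair for the named node."""
    return next((l, r) for _, n, l, r in pages if n == name)


def ancestors(pages, name):
    """Names of every ancestor of `name`, nearest parent first."""
    left, right = bounds(pages, name)
    # An ancestor encloses the node: smaller Left and bigger Right
    enclosing = [(l, n) for _, n, l, r in pages if l < left and r > right]
    # The nearest parent is the enclosing node with the largest Left
    return [n for _, n in sorted(enclosing, reverse=True)]


def is_leaf(pages, name):
    """A leaf's Right value immediately follows its Left value."""
    left, right = bounds(pages, name)
    return right == left + 1


print(ancestors(PAGES, "Fantasy"))
```

The same enclosing test translates directly into a SQL `WHERE` clause on the Left and Right columns, which is what makes the scheme attractive for databases.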
nasm format, and assembles to a DOS .com executable of 240 bytes.
The program is based on Horst Schäffer's 784-byte CRC32 calculator, which is part of the PBATS32 collection.
[bits 16]
org 0x0100

start:
    mov di,CMDBUF
    mov cx,0x7800
    xor ax,ax
    rep stosw

    mov di,LUTBUF
    xor cx,cx
.lutolp:
    xor dx,dx
    mov ax,cx
    mov ch,8
.lutlp:
    shr dx,1
    rcr ax,1
    jnc .xorskip
    xor dx,0xEDB8
    xor ax,0x8320
.xorskip:
    dec ch
    jnz .lutlp
    stosw
    mov ax,dx
    stosw
    inc cx
    and ch,ch
    jz .lutolp

    mov si,0x81
    mov di,CMDBUF
    mov dx,di
.getarg:
    lodsb
    cmp al,' '
    je .getarg
    jb .aend
    cmp al,','
    je .aend
    cmp al,'/'
    je .aend
    stosb
    jmp short .getarg
.aend:
    cmp dx,di
    je printhelp
    push di

    mov ax,0x3D40
    int 0x21
    jc printhelp
    mov bp,ax

    mov di,LUTBUF
    xor ax,ax
    dec ax
    cwd
.crcolp:
    push ax
    push dx
    mov si,READBUF
    mov dx,si
    mov cx,READBLEN
    mov bx,bp
    mov ah,0x3F
    int 0x21
    mov cx,ax
    pop dx
    pop ax
    jc .crcend
    jcxz .crcdone
.crclp:
    mov bx,ax
    lodsb
    xor bx,ax
    shl bx,2
    mov al,ah
    mov ah,dl
    mov dl,dh
    xor dh,dh
    xor ax,[bx+di]
    xor dx,[bx+di+2]
    loop .crclp
    cmp si,READBUF+READBLEN
    jnc .crcolp
.crcdone:
    not ax
    not dx
.crcend:
    push ax
    mov bx,bp
    mov ah,0x3e
    int 0x21
    pop bx
    pop di

    mov al,' '
    stosb
    mov cx,4
.prn:
    mov al,dh
    mov dh,dl
    mov dl,bh
    mov bh,bl
    call byteascii
    loop .prn
    mov ax,0x0a0d
    stosw
    mov al,'$'
    stosb
    mov dx,CMDBUF
    jmp short printmsg

printhelp:
    mov dx,msghelp
printmsg:
    mov ah,9
    int 0x21
    ret

byteascii:
    xor ah,ah
    div byte [divider]
    call .inner
    xchg ah,al
.inner:
    cmp al,10
    sbb al,0x69
    das
    stosb
    ret

divider: db 16
msghelp: db "Usage: CRC32 <file>",13,10,'$'

CMDBUF   equ 0x03C3
LUTBUF   equ 0x0468
READBUF  equ 0x0968
READBLEN equ 0xC000
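For readers who don't speak 16-bit assembly, the two loops in the listing (building a 256-entry lookup table, then folding each byte of the file through it) can be sketched in Python. This is a rough equivalent rather than a translation: the function names are invented for the example, and it works on a byte string instead of reading a file through DOS. The constant `0xEDB88320` is the value split across `0xEDB8` and `0x8320` in the listing.

```python
import zlib

POLY = 0xEDB88320  # appears as 0xEDB8 (high word) and 0x8320 (low word) above


def build_table():
    """Build the 256-entry lookup table, one entry per possible byte value."""
    table = []
    for n in range(256):
        value = n
        for _ in range(8):
            # Shift right; if a 1 bit fell off the end, fold in the polynomial
            if value & 1:
                value = (value >> 1) ^ POLY
            else:
                value >>= 1
        table.append(value)
    return table


def crc32(data, table=None):
    """Table-driven CRC32 of a bytes object."""
    if table is None:
        table = build_table()
    crc = 0xFFFFFFFF  # start with all bits set
    for byte in data:
        crc = (crc >> 8) ^ table[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF  # final inversion


# Cross-check against the implementation in the standard library
print(hex(crc32(b"hello")), hex(zlib.crc32(b"hello")))
```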
Below is a table of countries, with their ISO and UN codes, assigned top-level Internet domains and timezone relative to UTC. In all cases, there may be gaps where the ISO or UN codes, or Internet domains, have not yet been assigned. Dependent territories are listed as entries under the country recognised as responsible for them; entities which no longer exist are shown in grey.
Caveats apply especially in the cases of Serbia, Montenegro and Kosovo, where the codes may have changed since this list was produced.
This list is also available in SQL format, where the following fields are defined:
country_id: Sequential from 1 to 250;
name: Alpha characters, ASCII;
iso2: ISO alpha-2 code if available;
iso3: ISO alpha-3 code if available;
un3: UN number if available;
sovereign: Flag (0 if dependent territory);
extant: Flag (0 if no longer existing);
parent: Set to parent's country_id if dependent;
cctld: Two-character Internet domain;
time_offset: Number of hours from UTC.
http://imrannazar.com/content/files/countries.sql
Country | ISO alpha-2 | ISO alpha-3 | UN number | Internet TLD | Timezone |
---|---|---|---|---|---|
Afghanistan | AF | AFG | 004 | af | +4.50 |
Albania | AL | ALB | 008 | al | +1 |
Algeria | DZ | DZA | 012 | dz | +1 |
Andorra | AD | AND | 020 | ad | +1 |
Angola | AO | AGO | 024 | ao | +1 |
Antigua and Barbuda | AG | ATG | 028 | ag | -4 |
Argentina | AR | ARG | 032 | ar | -3 |
Armenia | AM | ARM | 051 | am | +4 |
Australia | AU | AUS | 036 | au | +10 |
Christmas Island | | | | cx | +7 |
Cocos (Keeling) Island | | | | cc | +6.50 |
Heard and McDonald Islands | | | | hm | +5 |
Norfolk Island | NF | NFK | 574 | nf | +11.50 |
Austria | AT | AUT | 040 | at | +1 |
Azerbaijan | AZ | AZE | 031 | az | +4 |
Bahamas | BS | BHS | 044 | bs | -5 |
Bahrain | BH | BHR | 048 | bh | +3 |
Bangladesh | BD | BGD | 050 | bd | +6 |
Barbados | BB | BRB | 052 | bb | -4 |
Belarus | BY | BLR | 112 | by | +2 |
Belgium | BE | BEL | 056 | be | +1 |
Belize | BZ | BLZ | 084 | bz | -6 |
Benin | BJ | BEN | 204 | bj | +1 |
Bhutan | BT | BTN | 064 | bt | +6 |
Bolivia | BO | BOL | 068 | bo | -4 |
Bosnia and Herzegovina | BA | BIH | 070 | ba | +1 |
Botswana | BW | BWA | 072 | bw | +2 |
Brazil | BR | BRA | 076 | br | -3 |
Brunei Darussalam | BN | BRN | 096 | bn | +8 |
Bulgaria | BG | BGR | 100 | bg | +2 |
Burkina Faso | BF | BFA | 854 | bf | +0 |
Burundi | BI | BDI | 108 | bi | +2 |
Cambodia | KH | KHM | 116 | kh | +7 |
Cameroon | CM | CMR | 120 | cm | +1 |
Canada | CA | CAN | 124 | ca | -5 |
Cape Verde | CV | CPV | 132 | cv | -1 |
Central African Republic | CF | CAF | 140 | cf | +1 |
Chad | TD | TCD | 148 | td | +1 |
Chile | CL | CHL | 152 | cl | -4 |
China | CN | CHN | 156 | cn | +8 |
Hong Kong | HK | HKG | 344 | hk | +8 |
Macau | MO | MAC | 446 | mo | +8 |
Colombia | CO | COL | 170 | co | -5 |
Comoros | KM | COM | 174 | km | +3 |
Congo, Democratic Republic of | CD | COD | 180 | cd | +1 |
Congo, Republic of | CG | COG | 178 | cg | +1 |
Costa Rica | CR | CRI | 188 | cr | -6 |
Cote d'Ivoire | CI | CIV | 384 | ci | +0 |
Croatia | HR | HRV | 191 | hr | +1 |
Cuba | CU | CUB | 192 | cu | -5 |
Cyprus | CY | CYP | 196 | cy | +2 |
Czech Republic | CZ | CZE | 203 | cz | +1 |
Denmark | DK | DNK | 208 | dk | +1 |
Faroe Islands | FO | FRO | 234 | fo | +0 |
Greenland | GL | GRL | 304 | gl | -3 |
Djibouti | DJ | DJI | 262 | dj | +3 |
Dominica | DM | DMA | 212 | dm | -4 |
Dominican Republic | DO | DOM | 214 | do | -4 |
Ecuador | EC | ECU | 218 | ec | -5 |
Egypt | EG | EGY | 818 | eg | +2 |
El Salvador | SV | SLV | 222 | sv | -6 |
Equatorial Guinea | GQ | GNQ | 226 | gq | +1 |
Eritrea | ER | ERI | 232 | er | +3 |
Estonia | EE | EST | 233 | ee | +2 |
Ethiopia | ET | ETH | 230 | et | +3 |
Fiji | FJ | FJI | 242 | fj | +12 |
Finland | FI | FIN | 246 | fi | +2 |
Aland | AX | ALA | 248 | | +2 |
France | FR | FRA | 250 | fr | +1 |
French Guiana | GF | GUF | 254 | gf | -3 |
French Polynesia | PF | PYF | 258 | pf | -10 |
French Southern Territories | | | | tf | +5 |
Guadeloupe | GP | GLP | 312 | gp | -4 |
Martinique | MQ | MTQ | 474 | mq | -4 |
Mayotte | YT | MYT | 175 | yt | +3 |
New Caledonia | NC | NCL | 540 | nc | +11 |
Reunion | RE | REU | 638 | re | +4 |
Saint Pierre and Miquelon | PM | SPM | 666 | pm | +0 |
Wallis and Futuna Islands | WF | WLF | 876 | wf | +12 |
Gabon | GA | GAB | 266 | ga | +1 |
Gambia | GM | GMB | 270 | gm | +0 |
Georgia | GE | GEO | 268 | ge | +4 |
Germany | DE | DEU | 276 | de | +1 |
Ghana | GH | GHA | 288 | gh | +0 |
Greece | GR | GRC | 300 | gr | +2 |
Grenada | GD | GRD | 308 | gd | -4 |
Guatemala | GT | GTM | 320 | gt | -6 |
Guinea | GN | GIN | 324 | gn | +0 |
Guinea-Bissau | GW | GNB | 624 | gw | +0 |
Guyana | GY | GUY | 328 | gy | -4 |
Haiti | HT | HTI | 332 | ht | -5 |
Honduras | HN | HND | 340 | hn | -6 |
Hungary | HU | HUN | 348 | hu | +1 |
Iceland | IS | ISL | 352 | is | +0 |
India | IN | IND | 356 | in | +5.50 |
Indonesia | ID | IDN | 360 | id | +7 |
Iran | IR | IRN | 364 | ir | +3.50 |
Iraq | IQ | IRQ | 368 | iq | +3 |
Ireland | IE | IRL | 372 | ie | +0 |
Israel | IL | ISR | 376 | il | +2 |
Italy | IT | ITA | 380 | it | +1 |
Jamaica | JM | JAM | 388 | jm | -5 |
Japan | JP | JPN | 392 | jp | +9 |
Jordan | JO | JOR | 400 | jo | +0 |
Kazakhstan | KZ | KAZ | 398 | kz | +6 |
Kenya | KE | KEN | 404 | ke | +3 |
Kiribati | KI | KIR | 296 | ki | +12 |
Korea, North | KP | PRK | 408 | kp | +9 |
Korea, South | KR | KOR | 410 | kr | +9 |
Kosovo | | | | | +1 |
Kuwait | KW | KWT | 414 | kw | +3 |
Kyrgyzstan | KG | KGZ | 417 | kg | +6 |
Laos | LA | LAO | 418 | la | +7 |
Latvia | LV | LVA | 428 | lv | +2 |
Lebanon | LB | LBN | 422 | lb | +0 |
Lesotho | LS | LSO | 426 | ls | +2 |
Liberia | LR | LBR | 430 | lr | +0 |
Libya | LY | LBY | 434 | ly | +2 |
Liechtenstein | LI | LIE | 438 | li | +1 |
Lithuania | LT | LTU | 440 | lt | +2 |
Luxembourg | LU | LUX | 442 | lu | +1 |
Macedonia | MK | MKD | 807 | mk | +1 |
Madagascar | MG | MDG | 450 | mg | +3 |
Malawi | MW | MWI | 454 | mw | +2 |
Malaysia | MY | MYS | 458 | my | +8 |
Maldives | MV | MDV | 462 | mv | +5 |
Mali | ML | MLI | 466 | ml | +0 |
Malta | MT | MLT | 470 | mt | +1 |
Marshall Islands | MH | MHL | 584 | mh | +12 |
Mauritania | MR | MRT | 478 | mr | +0 |
Mauritius | MU | MUS | 480 | mu | +4 |
Mexico | MX | MEX | 484 | mx | -6 |
Micronesia | FM | FSM | 583 | fm | +10 |
Moldova | MD | MDA | 498 | md | +2 |
Monaco | MC | MCO | 492 | mc | +1 |
Mongolia | MN | MNG | 496 | mn | +8 |
Montenegro | | | | me | +1 |
Morocco | MA | MAR | 504 | ma | +0 |
Western Sahara | EH | ESH | 732 | eh | +0 |
Mozambique | MZ | MOZ | 508 | mz | +2 |
Myanmar | MM | MMR | 104 | mm | +6.50 |
Namibia | NA | NAM | 516 | na | +2 |
Nauru | NR | NRU | 520 | nr | +12 |
Nepal | NP | NPL | 524 | np | +5.75 |
Netherlands | NL | NLD | 528 | nl | +1 |
Aruba | AW | ABW | 533 | aw | -4 |
Netherlands Antilles | AN | ANT | 530 | an | -4 |
New Zealand | NZ | NZL | 554 | nz | +12 |
Cook Islands | CK | COK | 184 | ck | -10 |
Niue | NU | NIU | 570 | nu | -11 |
Tokelau | | | | tk | -10 |
Nicaragua | NI | NIC | 558 | ni | -6 |
Niger | NE | NER | 562 | ne | +1 |
Nigeria | NG | NGA | 566 | ng | +1 |
Norway | NO | NOR | 578 | no | +1 |
Bouvet Island | | | | bv | +1 |
Svalbard and Jan Mayen Islands | SJ | SJM | 744 | sj | +1 |
Oman | OM | OMN | 512 | om | +4 |
Pakistan | PK | PAK | 586 | pk | +5 |
Palau | PW | PLW | 585 | pw | +9 |
Palestinian Territory, Occupied | PS | PSE | 275 | ps | +0 |
Panama | PA | PAN | 591 | pa | -5 |
Papua New Guinea | PG | PNG | 598 | pg | +10 |
Paraguay | PY | PRY | 600 | py | -4 |
Peru | PE | PER | 604 | pe | -5 |
Philippines | PH | PHL | 608 | ph | +8 |
Poland | PL | POL | 616 | pl | +1 |
Portugal | PT | PRT | 620 | pt | +0 |
Qatar | QA | QAT | 634 | qa | +3 |
Romania | RO | ROU | 642 | ro | +2 |
Russia | RU | RUS | 643 | ru | +3 |
Rwanda | RW | RWA | 646 | rw | +2 |
Saint Kitts and Nevis | KN | KNA | 659 | kn | -4 |
Saint Lucia | LC | LCA | 662 | lc | -4 |
Saint Vincent and the Grenadines | VC | VCT | 670 | vc | -4 |
Samoa | WS | WSM | 882 | ws | -11 |
San Marino | SM | SMR | 674 | sm | +1 |
Sao Tome and Principe | ST | STP | 678 | st | +0 |
Saudi Arabia | SA | SAU | 682 | sa | +3 |
Senegal | SN | SEN | 686 | sn | +0 |
Serbia | CS | SCG | 891 | rs | +1 |
Seychelles | SC | SYC | 690 | sc | +4 |
Sierra Leone | SL | SLE | 694 | sl | +0 |
Singapore | SG | SGP | 702 | sg | +8 |
Slovakia | SK | SVK | 703 | sk | +1 |
Slovenia | SI | SVN | 705 | si | +1 |
Solomon Islands | SB | SLB | 090 | sb | +11 |
Somalia | SO | SOM | 706 | so | +3 |
South Africa | ZA | ZAF | 710 | za | +2 |
Spain | ES | ESP | 724 | es | +1 |
Sri Lanka | LK | LKA | 144 | lk | +5.50 |
Sudan | SD | SDN | 736 | sd | +3 |
Suriname | SR | SUR | 740 | sr | +0 |
Swaziland | SZ | SWZ | 748 | sz | +2 |
Sweden | SE | SWE | 752 | se | +1 |
Switzerland | CH | CHE | 756 | ch | +1 |
Syria | SY | SYR | 760 | sy | +0 |
Taiwan | TW | TWN | 158 | tw | +8 |
Tajikistan | TJ | TJK | 762 | tj | +5 |
Tanzania | TZ | TZA | 834 | tz | +3 |
Thailand | TH | THA | 764 | th | +7 |
Timor-Leste | TL | TLS | 626 | tp | +9 |
Togo | TG | TGO | 768 | tg | +0 |
Tonga | TO | TON | 776 | to | +13 |
Trinidad and Tobago | TT | TTO | 780 | tt | -4 |
Tunisia | TN | TUN | 788 | tn | +1 |
Turkey | TR | TUR | 792 | tr | +2 |
Turkmenistan | TM | TKM | 795 | tm | +5 |
Tuvalu | TV | TUV | 798 | tv | +12 |
USSR | SU | SUN | 810 | su | +0 |
Uganda | UG | UGA | 800 | ug | +3 |
Ukraine | UA | UKR | 804 | ua | +2 |
United Arab Emirates | AE | ARE | 784 | ae | +4 |
United Kingdom | GB | GBR | 826 | uk | +0 |
Anguilla | AI | AIA | 660 | ai | -4 |
Ascension Island | | | | ac | +0 |
Bermuda | BM | BMU | 060 | bm | -4 |
British Indian Ocean Territory | | | | io | +6 |
British Virgin Islands | VG | VGB | 092 | vg | -4 |
Cayman Islands | KY | CYM | 136 | ky | -5 |
Falkland Islands (Malvinas) | FK | FLK | 238 | fk | -4 |
Gibraltar | GI | GIB | 292 | gi | +1 |
Guernsey | GG | GGY | 831 | gg | +0 |
Isle of Man | IM | IMN | 833 | im | +0 |
Jersey | JE | JEY | 832 | je | +0 |
Montserrat | MS | MSR | 500 | ms | -4 |
Pitcairn Island | PN | PCN | 612 | pn | -8 |
Saint Helena | SH | SHN | 654 | sh | +0 |
South Georgia and the South Sandwich Islands | | | | gs | -2 |
Turks and Caicos Islands | TC | TCA | 796 | tc | -5 |
United States of America | US | USA | 840 | us | -5 |
American Samoa | AS | ASM | 016 | as | -11 |
Guam | GU | GUM | 316 | gu | +10 |
Northern Mariana Islands | MP | MNP | 580 | mp | +10 |
Puerto Rico | PR | PRI | 630 | pr | -4 |
US Minor Outlying Islands | | | | um | -11 |
United States Virgin Islands | VI | VIR | 850 | vi | -4 |
Uruguay | UY | URY | 858 | uy | -3 |
Uzbekistan | UZ | UZB | 860 | uz | +5 |
Vanuatu | VU | VUT | 548 | vu | +11 |
Vatican City State (Holy See) | VA | VAT | 336 | va | +1 |
Venezuela | VE | VEN | 862 | ve | -4.50 |
Vietnam | VN | VNM | 704 | vn | +7 |
Yemen | YE | YEM | 887 | ye | +3 |
Yugoslavia | YU | YUG | 890 | yu | +0 |
Zambia | ZM | ZMB | 894 | zm | +2 |
Zimbabwe | ZW | ZWE | 716 | zw | +2 |
If you need to update more than one portion of a page at the same time, the traditional asynchronous request can cause some pain. Since the methodology only allows for the update of one portion, multiple requests have to be generated; each of these requests will have a load time, involving processing and transfer. The end result is that the application works less efficiently than otherwise may be possible.
If instead, it were possible to retrieve all the updates in one request, that would allow for a quicker and more responsive application. This can be done relatively simply, by taking advantage of JavaScript Object Notation.
By way of example, let's take a football results website that wishes to display scores in more-or-less real time. By the traditional method of asynchronous requests, this could be done one of two ways:
A good way to alleviate this problem would be to send only the games which have changed score, in some easy-to-transfer encoding. JSON provides that encoding, through a few simple rules:
true, false, null, or one of the following two types.
If the current list of football results is as follows:
<ul>
 <li>Manchester United:<strong id="MAN">2</strong>-<strong id="AST">3</strong>:Aston Villa</li>
 <li>Arsenal:<strong id="ARS">1</strong>-<strong id="EVE">1</strong>:Everton</li>
</ul>
A simplistic method of transferring some updates for the football results may, using JSON, generate the following response:
{"MAN":3, "EVE":2}
As can be seen in the above response, the value for MAN is now 3, and the value for EVE is 2. These name/value pairs can be used to update the appropriate elements: for each pair in the response, update the element whose id is the name of the pair, with the new value.
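The "for each pair, update the element with that id" rule is independent of the browser. As a rough sketch, here it is in Python with a plain dictionary standing in for the page elements; `apply_updates` and the `scores` dictionary are hypothetical names for this example only.

```python
import json


def apply_updates(scores, response):
    """Merge a JSON object of id -> new value into the current scores.

    Returns the ids whose values actually changed, in response order.
    """
    updates = json.loads(response)
    changed = []
    for element_id, value in updates.items():
        if scores.get(element_id) != value:
            scores[element_id] = value
            changed.append(element_id)
    return changed


# The four scores from the HTML list above, keyed by element id
scores = {"MAN": 2, "AST": 3, "ARS": 1, "EVE": 1}
changed = apply_updates(scores, '{"MAN":3, "EVE":2}')
print(changed, scores)
```

Only the two games named in the response are touched; every other score stays as it was, which is the point of sending a partial update.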
The name "JavaScript Object Notation" implies some native ability of JavaScript to understand it; and indeed, it's possible for a script to simply evaluate the JSON response and refer to its contents as if it were a normal variable. A script to update the page contents in the manner described above may look similar to this:
updateScores = function(response)
{
    // Evaluate the JSON text into an object
    var r = eval('(' + response + ')');

    // Each key is the id of an element to be updated
    for (var k in r)
        document.getElementById(k).innerHTML = r[k];
};
If this function is used as the response handler by the AJAX script, it will be able to parse the JSON-encoded strings returned from the server. One of the advantages of using JSON is the ability to send other things than a plain response; for example, an inline script could be sent with the update:
{"MAN":3, "EVE":2, "script":"alert('There has been an update.');"}
A simple modification to the response handler will suffice to be able to use this new inline script:
updateScores = function(response)
{
    var r = eval('(' + response + ')');
    for (var k in r)
    {
        // The "script" key carries code to run, not an element id
        if(k == 'script')
            eval(r[k]);
        else
            document.getElementById(k).innerHTML = r[k];
    }
};
There are a couple of caveats involved with using this methodology:
eval in JavaScript, since it's possible for arbitrary code to be executed simply by returning desired code instead of the JSON response. Some precautions have been taken by the response handler I've set out above, but the server should also be employed to ensure that the requesting session is valid.
If these issues are kept in mind, JSON-encoded AJAX responses are a very useful tool for live Website updates, and can be built upon to generate highly efficient and responsive tools.
The current generation of languages has generally eschewed pointers, and most big thinkers in programming discourage their use. As The C Programming Language states:
Pointers have been lumped with the goto statement as a marvelous way to create impossible-to-understand programs. ... With discipline, however, pointers can be used to achieve clarity and simplicity.
It's for this reason that the team behind the development of C# went against the trend of removing pointer syntax, and included pointers in the language.
unsafe blocks
C# is one of the languages in the .NET family, and as such is run under the Common Language Runtime (CLR). It's the runtime that takes care of memory operations and lower-level functionality on the program's behalf, so that the program doesn't normally have to worry about pointers or memory buffers.
The language does, however, provide a way for pieces of code to avoid the constraints of the CLR. Any blocks marked as unsafe will not be managed by the runtime, and it's up to the programmer to test and ensure that the code works as expected.
unsafe block in a method
static void foo()
{
    int i = 10;
    unsafe
    {
        int *p = &i;
        System.Console.WriteLine("Value at p: " + *p);
        System.Console.WriteLine("Address of p: " + (int)p);
    }
}
As can be seen in this example, pointer dereferencing and casts are both allowed to happen inside an unsafe block, such that pointer operations within these blocks proceed much as they would under C.
One of the first things a budding coder wants to learn is how to make a game, and the first stage in that journey is how to output graphics to the screen. Among the simplest demonstrations of graphical output are the two-dimensional colour effects: palette rotation, plasma, and the effect being explored here, fire.
The effect relies on the screen being represented as a memory buffer, running left-to-right and top-down; accessing consecutive memory locations will run through the whole buffer in sequence. An averaging algorithm is applied across the buffer, which runs as follows:
// Introduce randomness to the averaging
For each X-coordinate on the bottom line
    Fill in the pixel with a random value
Next

// Apply averaging filter
For all other lines in the buffer
    For each X-coordinate on the line
        Total = Value of current pixel
              + Value of pixel to the right
              + Value of pixel underneath
              + Value of pixel to the bottom left
        Avg = Total / 4

        // Decrement, so lines toward the top fade to 0
        Avg = Avg-1
        If (Avg < 0) Avg = 0

        Value of current pixel = Avg
    Next
Next
The effect of this is that the larger values, from the line beneath, are transferred in lesser form up the screen; combined with a forced decrement on the averaged value, the effect is a randomised fading from high values at the bottom, to zero at the top. With the appropriate palette to define high values as white, going through yellow and red to black, it's easy to make this look like a burning fire.
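To see the averaging pass in isolation from any windowing code, here is a small Python sketch of one frame of the filter over a flat list, laid out left-to-right and top-down as described. The function name `fire_step` and the `rng` parameter are inventions for this example; the parameter exists only so that the random source can be swapped out.

```python
import random


def fire_step(buf, width, height, rng=random):
    """Apply one frame of the fire filter, in place, to a flat pixel list."""
    bottom = (height - 1) * width  # index of the first pixel on the bottom line

    # Introduce randomness: refill the bottom line
    for x in range(width):
        buf[bottom + x] = rng.randrange(256)

    # Average every other pixel with its right, lower and lower-left neighbours
    for i in range(bottom):
        total = buf[i] + buf[i + 1] + buf[i + width] + buf[i + width - 1]
        # Decrement so that values fade toward the top, and clamp at zero
        buf[i] = max(total // 4 - 1, 0)
    return buf


# A tiny 8x6 buffer run for a few frames
frame = [0] * (8 * 6)
for _ in range(10):
    fire_step(frame, 8, 6)
print(frame[:8])
```

Note that the pixel underneath is `width` entries further along the list, since each line of the buffer holds `width` pixels.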
It's a situation that's ideal for pointers: dereferencing a pointer to get the value at the current pixel, adding constants on to get to the pixels around the current one, and then pushing the pointer along to do the next byte.
The simplest way of using this effect to get something on the screen is to use a Windows Form; the Form base class will handle all the window instantiation and mouse events, leaving us to concentrate on putting data into the window. By holding a handle to an 8-bit bitmap image, and drawing that image into the window, the Bitmap class will do all the work of calculating palette colours and translating them from the bitmap indices.
fire.cs: Rendering the fire
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

class FirstForm : Form
{
    private Bitmap buf;   // Graphic buffer
    private Random rnd;   // RNG source

    private const int width = 320;
    private const int height = 240;

    public FirstForm()
    {
        // Initialise an RNG for later use
        rnd = new Random();

        // Set the initial properties of the form
        this.Text = "Fire #1";
        this.ClientSize = new Size(width, height);
        this.MaximizeBox = false;
        this.BackColor = Color.Black;
        SetStyle(ControlStyles.Opaque, true);

        // Nominate the paint function
        this.Paint += new PaintEventHandler(this.DoFire);

        // Generate a 320x240 bitmap, fire palette
        buf = new Bitmap(width, height, PixelFormat.Format8bppIndexed);
        ColorPalette pal = buf.Palette;

        // Fill the palette with the following 64-colour blocks:
        // Black to red, Red to yellow, Yellow to white, White
        // Since each range is 64 colours, and RGB spans 256 values,
        // utilise the left shift to multiply up
        for(int i=0; i<64; i++)
        {
            pal.Entries[i]     = Color.FromArgb(i<<2, 0, 0);
            pal.Entries[i+64]  = Color.FromArgb(255, i<<2, 0);
            pal.Entries[i+128] = Color.FromArgb(255, 255, i<<2);
            pal.Entries[i+192] = Color.FromArgb(255, 255, 255);
        }
        buf.Palette = pal;
    }

    // The paint function delegated to handle drawing the fire
    private void DoFire(object src, PaintEventArgs e)
    {
        // Lock the bitmap so we can write to it direct
        BitmapData buflock = buf.LockBits(
            new Rectangle(Point.Empty, buf.Size),
            ImageLockMode.ReadWrite,
            PixelFormat.Format8bppIndexed);

        // Write a fire
        // This section uses pointers, and is thus deemed "unsafe"
        unsafe
        {
            // Fetch a pointer to the top scanline of the image
            Byte *bufdata = (Byte*)buflock.Scan0;
            Byte *bufbottom = bufdata + ((height-1) * width);
            Byte *i;
            int v;

            // Write a random bottom line as source of the fire
            for(int x=0; x<width; x++)
            {
                *(bufbottom+x) = (Byte)rnd.Next(0, 255);
            }

            // For each pixel in the image, average the values of
            // the pixel, the one to the right, the one underneath
            // and the one to the bottom left. Decrement, threshold
            // to 0, and write to the current position.
            for(i=bufdata; i<bufbottom; i++)
            {
                // The pixel underneath is one scanline (width bytes) on
                v = *i + *(i+1) + *(i+width) + *(i+width-1);
                v = (v / 4) - 1;
                if(v<0) v=0;
                *i = (Byte)v;
            }
        }

        // Unlock ourselves out from the image and blit it to the Form
        buf.UnlockBits(buflock);
        e.Graphics.DrawImageUnscaled(buf, 0, 0);

        // Ensure that we'll be drawing another frame real soon, by
        // forcing a repaint
        this.Invalidate();
    }

    public static void Main(string[] args)
    {
        Application.Run(new FirstForm());
    }
}
In the full example above, the unsafe block in the paint handler is where the averaging algorithm is applied across the bitmap buffer, through the use of pointers. By running this code for a few seconds, the following is produced on-screen.
So that's how it works. Some languages in the current crop forbid the use of pointers altogether: C# allows their use, but only if you promise to keep things clean, because the runtime won't do it for you.
Copyright Imran Nazar <tf@imrannazar.com>, 2008.
At this point, it's simple to work the version control: the developer commits from their local working copy, and updates the "live" working copy to the new revision. There are, of course, a couple of drawbacks to this approach:
exported, and copied to the live server. Depending on the convolutions required to connect to the live server, this can be a tedious and/or complicated process.
This article discusses the next phase of repository setup: separation of the testing and production environments, and automated updating of both.
If a common testing server is introduced into the process, and a working copy of the codebase is placed into testing, it's quite simple for a developer to test code: simply commit their working copy to the repository, and update the copy on the testing server.
As with a standard production server, however, a manual step remains: the testing copy has to be updated before it can be used. The commit process itself must somehow automatically update the testing copy if the test server is to be of any real use. Fortunately, Subversion provides a facility to perform actions as part of a commit, with the hook scripting system.
A Subversion hook is a script that is run by the Subversion server whenever a particular thing happens. There are a few actions which can trigger a hook, but we're only interested in the commit hook scripts:
For the automated deployment process, the post-commit hook will be used to ensure that only successful commits are replicated to the testing and/or live servers.
The post-commit hook script is given two parameters by the Subversion server: the path to the repository that's just been updated, and the revision number of the update. Since the hook script is specific to a repository, it can be customised:
The hook scripts can be in any language usable by the server, including Bash, Python or Perl; I've used Bash for the purposes of this article, so the above control flow would translate into the following post-commit hook:
#!/bin/bash
REPO="$1"
REV="$2"
TEST_SERVER="192.168.1.55"

# Update the working copy on the test server
ssh -l root $TEST_SERVER -t "cd /var/www && svn up"
Note in the above example that the testing environment is accessed by ssh, which means that the root user on the testing server must know the public key of the Subversion server's user account. For example, if the Subversion repositories are accessed by WebDAV through Apache, and the Apache process is running as user nobody, the user calling the post-commit hook is nobody@SVN_SERVER, and a public/private ssh key pair must be prepared for this user and copied to the testing server.
The control flow in the above example updates the testing environment every time an update is committed to the repo. What's needed next is a method of automatically pushing committed updates to the production system, using some part of the commit action as a trigger. The ideal vector for this is the commit message: the description entered by the developer as the reason for this update.
A command is available as part of the Subversion distribution to examine the properties of a repository: svnlook. This can be used to check the commit message for the latest revision, and look for a signal inserted by the developer to indicate a request for deployment.
The deployment signal can be as simple as a block of text inside the commit message: if the post-commit hook detects this block of text in the message, it will perform the deployment. As stated above, svnlook can be used to look at the commit message:
svnlook
PROD_SERVER="172.16.16.1"

if ( svnlook log -r $REV $REPO | grep "~~DEPLOY~~" )
then
    /usr/local/bin/svn-deploy $REPO $REV "root@${PROD_SERVER}:/var/www"
fi
By asking for a specific revision using the -r flag, we can ensure that the revision number passed into the hook script is the one that gets checked. Even though this number should be the latest revision in the repo, it's best to make use of the revision number when it's given.
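Since hook scripts can be written in any language the server can run, the same check could be sketched in Python. The following is an illustration under assumptions rather than a drop-in hook: the function names are made up, and `commit_message` simply shells out to the same `svnlook log -r REV REPO` command used above, so it needs `svnlook` on the path to do anything useful.

```python
import subprocess

DEPLOY_SIGNAL = "~~DEPLOY~~"


def wants_deploy(log_message, signal=DEPLOY_SIGNAL):
    """True if the commit message contains the deployment signal."""
    return signal in log_message


def commit_message(repo, rev):
    """Fetch the log message for one revision, via `svnlook log -r REV REPO`."""
    result = subprocess.run(
        ["svnlook", "log", "-r", str(rev), repo],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# The signal can appear anywhere in the message
print(wants_deploy("Admin: Orders: autosave added\n~~DEPLOY~~\n"))
```

Keeping the string test separate from the call to `svnlook` means the decision logic can be exercised without a repository to hand.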
One way to deploy a Subversion repo is to simply keep a working copy as the production environment: in this situation, deployment is as easy as updating the production working copy. The disadvantage of this is that the Subversion control files and directories will be available in production; since these control files include the text base of the working copy, this exposes the backend code and database interfaces in plain text files.
Subversion provides a command targeted to producing a "clean" copy of the repository: a dump of the contents, without .svn directories littering the structure. That command is svn export:
svn export
svn export -r $REV "file://$REPO" /destination/path
By asking for a specific revision, as with svnlook, we make sure that the revision passed to the hook script is the one exported. Once the export has occurred, this can be uploaded to the production server, or synchronised with rsync in whatever way is required.
With these components, we can put together the hook and deployment scripts:
post-commit: Hook script
#!/bin/bash
REPO="$1"
REV="$2"
TEST_SERVER="192.168.1.55"
PROD_SERVER="172.16.16.1"

# Update the working copy on the test server
ssh -l root $TEST_SERVER -t "cd /var/www && svn up"

# Check for a deployment signal
if ( svnlook log -r $REV $REPO | grep "~~DEPLOY~~" )
then
    /usr/local/bin/svn-deploy $REPO $REV "root@${PROD_SERVER}:/var/www"
fi

svn-deploy: Example deployment script
#!/bin/bash
REPO="$1"
REV="$2"
TARGET="$3"

# Connect to datacentre VPN
sudo pppd call datacentre nodetach
sudo route add -net 172.16.16.0/24 dev ppp0

# Export the repo
rm -rf /tmp/export
svn export -r $REV "file://$REPO" /tmp/export

# Synchronise with production
rsync -az -e ssh /tmp/export/* $TARGET
In this particular case, the production server is behind a VPN at the datacentre, which must be tunneled through for the deployment to occur.
Once the post-commit hook has been put into place by a repository administrator, any developer with a checked-out copy of the repo is free to commit updates; any update will cause the testing environment copy to be updated, allowing for a common testing point.
Deployment is signalled as part of the commit message for a revision, as below:
- Frontend: Checkout process: CC payment handling added
- Admin: Orders: Status dropdown now autosaves on change
~~DEPLOY~~
As can be seen above, the deployment code "~~DEPLOY~~" must be present in the commit message for deployment to be signalled. Any files changed as part of the commit will be saved in the new revision, before deployment; the copy to production will include all files in the repository that have changed since the last deployment.
There are a few ways in which the above simple scripts could be enhanced.
svnlook, it's possible to generate a Changelog for the repository: a list of revisions ordered by date, showing what changes were made to the codebase at each point. It's also possible to email the Changelog to the developers, if this is desired.
svnlook also allows the hook script to look at which files were modified with the commit. If a pre-determined SQL file is modified, this can be used to signal a change in database structure, and the changes can be applied to the production database through an ssh connection.
Each of these possible changes would introduce complexity into the automated deployment system; for now, the scripts presented here are a simple way to speed up the testing and deployment process.
Copyright Imran Nazar <tf@oopsilon.com>, 2008
mail function to send a few paragraphs of text to an email address. This is an easy and functional way to send status messages and other small emails, but there are disadvantages:
These problems are not just limitations of PHP's mailing routines: they are limitations of the email transport mechanism. In order to get around them, a devious scheme was standardised in the 1990s.
The MIME standard was designed to work inside the existing email transport system; as such, it doesn't need any special connection methods, and no complicated networking is required on the part of the developer. Instead, MIME allows for multi-part messages by inserting all the parts into a plain-text email, and separating the parts by a boundary.
Text outside the boundary (part #0)
--BOUNDARY
Part #1
--BOUNDARY
Part #2
--BOUNDARY--
As can be seen in the example above, two hyphens precede all instances of the boundary, and one boundary forms the end of one part and the start of the next. The end of the last part is denoted by two hyphens after the boundary closing that part.
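The boundary rules are mechanical enough to sketch in a few lines of Python. The helper below is hypothetical and deliberately bare: it reproduces only the structure shown above (text before the first boundary, two hyphens before each boundary, two more after the last) and leaves out per-part headers. The lines are joined with CRLF, the line ending used in email messages.

```python
def build_multipart(preamble, parts, boundary):
    """Join plain-text parts into one body, separated by the boundary."""
    lines = [preamble]
    for part in parts:
        # Two hyphens precede every instance of the boundary
        lines.append("--" + boundary)
        lines.append(part)
    # The closing boundary also carries two trailing hyphens
    lines.append("--" + boundary + "--")
    return "\r\n".join(lines)


message = build_multipart(
    "Text outside the boundary (part #0)",
    ["Part #1", "Part #2"],
    "BOUNDARY",
)
print(message)
```

In real code the boundary string has to be chosen so that it never occurs inside any of the parts; the fixed `BOUNDARY` here is only for readability.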
The basic structure outlined above allows for the separation of parts, but all it can provide is multiple plain-text messages combined into one. To allow for more complex information to be encoded, headers must be provided in association with each part.
An issue arises with the boundary structure: how is the email reading client to know which lines denote the boundary for a part, and which are simply part of the message? The client can be informed of which boundary is being used by providing a header for the message in total. Headers are often used to denote the originator of the message, the software version of the sending server, and other such information which may be pertinent to the client; the MIME boundary can be added to this.
    From: "Imran Nazar" <tf@oopsilon.com>
    MIME-Version: 1.0
    Content-type: multipart/mixed; boundary="BOUNDARY"
    <blank line>
    Message body
The Content-type header tells the email client what kind of data is provided in the message; the text following Content-type is known as a MIME type. The concept of MIME types has been extended for use beyond email, and is now commonly provided by Web and file servers in response to a request for data.
The MIME type provided with a chunk of data can be used to identify the data in question. There are various classes of data that have MIME types associated with them, and subdefinitions for each class. The class and subclass of data are given in major/minor format; a few examples are provided below.
Major | Minor | Full type | Data |
---|---|---|---|
Text documents | | | |
text | plain | text/plain | Plain text documents |
text | html | text/html | HTML documents |
text | csv | text/csv | Comma-separated data files |
Images | | | |
image | jpeg | image/jpeg | JPEG-formatted images |
image | png | image/png | PNG-formatted images |
Application-specific types | | | |
application | pdf | application/pdf | Portable Document Format (PDF) |
application | zip | application/zip | PKZIP compressed archives |
application | msword | application/msword | MS Word documents |
Types with multiple components | | | |
multipart | form-data | multipart/form-data | Web forms with uploaded files |
multipart | mixed | multipart/mixed | Messages with many types of component |
As can be seen above, the multipart/mixed MIME type tells the email reader that each part of the message can be of a different type. Just as with the message, each part can have a header and a body. Taking this into account, a fuller MIME-compliant message can be built.
    This is part #0.
    --BOUNDARY
    Content-type: text/html

    This is part #1.
    --BOUNDARY
    Content-type: text/csv

    id,content,date
    "1","This is part #2.","2008-08-10"
    --BOUNDARY--
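As a rough illustration of the layout above, the following Python sketch joins a preamble and a list of parts with a boundary. The function name `build_multipart` and the (headers, body) pair layout are assumptions made for this example; they are not part of the MIME standard, nor of the script presented later in this article.

```python
def build_multipart(preamble, parts, boundary="BOUNDARY"):
    """Join message parts into a MIME multipart body.

    `parts` is a list of (headers, body) pairs, where `headers` is a dict
    mapping header names to values. Lines are separated by CRLF.
    """
    lines = [preamble]
    for headers, body in parts:
        # Two hyphens precede every instance of the boundary
        lines.append("--" + boundary)
        for name, value in headers.items():
            lines.append(f"{name}: {value}")
        # A blank line separates a part's headers from its body
        lines.append("")
        lines.append(body)
    # Two further hyphens after the boundary close the final part
    lines.append("--" + boundary + "--")
    return "\r\n".join(lines) + "\r\n"


message = build_multipart(
    "This is part #0.",
    [
        ({"Content-type": "text/html"}, "This is part #1."),
        ({"Content-type": "text/csv"},
         'id,content,date\r\n"1","This is part #2.","2008-08-10"'),
    ],
)
print(message)
```

The boundary string here is fixed for readability; a real message would need a boundary that cannot occur inside any of the parts.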
We've seen how to put multiple types of message into one email, but this is not sufficient for attaching documents and other files to an email message. There are two major problems with inserting documents into an email:
- Content-type, called Content-disposition.
- Content-transfer-encoding.

The Content-disposition attached to a message part can be one of two types: inline, meaning this part is to be shown as part of the email, and attachment, which denotes a file attached for download. If it's an attachment, a filename can be provided as a parameter to the Content-disposition header. Using this header, we can make the CSV data file in the above example into an attachment:
    This is part #0.
    --BOUNDARY
    Content-type: text/html
    Content-disposition: inline

    This is part #1.
    --BOUNDARY
    Content-type: text/csv
    Content-disposition: attachment; filename="data.csv"

    id,content,date
    "1","This is part #2.","2008-08-10"
    --BOUNDARY--
This takes care of the first problem with attaching files to an email, but the second remains: encoding the attachment into a transferable format. There are two major encoding methods allowed by the MIME standard:
- quoted-printable: a discriminate encoding, which allows standard text through without encoding, but translates non-standard characters into their hexadecimal ordinal values;
- base64: an indiscriminate encoding, which takes the whole stream of data as one number, and translates it a chunk at a time, three bytes translating into a 4-character block.

The base64 encoding is generally easier to produce, since the quoted-printable encoding requires specialised translation tables. With base64, the data is broken up into 48-byte "lines", and encoded into 64-character lines before insertion into the email.
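The 48-byte to 64-character relationship can be checked with a short Python sketch, using the standard `base64` module. The helper name `base64_lines` is an assumption for this example only.

```python
import base64


def base64_lines(data, bytes_per_line=48):
    """Encode bytes as base64, one output line per 48 bytes of input."""
    lines = []
    for start in range(0, len(data), bytes_per_line):
        chunk = data[start:start + bytes_per_line]
        # Every 3 input bytes become 4 output characters: 48 bytes -> 64 chars
        lines.append(base64.b64encode(chunk).decode("ascii"))
    return lines


sample = bytes(range(100))
encoded = base64_lines(sample)
for line in encoded:
    print(len(line), line)
```

Because 48 is a multiple of three, only the final line can carry "=" padding, so the lines can be joined back together and decoded as one stream.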
Once an encoding has been picked, it should be provided in the header for the message part, as shown in the below example.
    Content-type: image/gif
    Content-disposition: attachment; filename="text-icon.gif"
    Content-transfer-encoding: base64

    R0lGODlhIAAgAKIEAISEhMbGxgAAAP///////wAAAAAAAAAAACH5BAEAAAQALAAA
    AAAgACAAAAOaSKoi08/BKeW6Cgyg+e7gJwICRmjOM6hs6q5kUF7o+rZ2vgkypq3A
    oHA4kPVoxCTROFv8lNAir5mxNa7ESorpi0a5yMg15QU7vVBzFZ1Un9jtaVeMRbuf
    8OA9P9zTx4CAK358QH6BiIJSR2eFhnJhiZJbkI2Oi1Rvf5N1hI6ehYeKZZVrl6Jj
    bKB8q3luJwGxsrO0taUXnLkXCQA7
Now we have all the pieces of the puzzle: the ability to create an email message with multiple parts, and a way to encode and attach files to the email. It's just a matter of implementation.
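For comparison, the same pieces can be seen working together in Python's standard `email` package, which builds the boundaries, the Content-disposition header and the base64 encoding itself. This is only a sketch in a different language from the PHP solution that follows; the addresses are placeholders, and the CSV content is borrowed from the earlier example.

```python
from email.message import EmailMessage

csv_bytes = b'id,content,date\r\n"1","This is part #2.","2008-08-10"\r\n'

msg = EmailMessage()
msg["From"] = "noreply@example.com"      # placeholder address
msg["To"] = "someone@example.com"        # placeholder address
msg["Subject"] = "Test message"

# The plain-text body becomes the first part of the message
msg.set_content("An example email message.")

# Adding an attachment converts the message to multipart/mixed and
# base64-encodes the supplied bytes
msg.add_attachment(csv_bytes, maintype="text", subtype="csv",
                   filename="data.csv")

print(msg.as_string())
```

Printing the message shows the generated boundary and the per-part headers described above.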
With the information above, implementation is no issue. The only problem presented is how to define the MIME type of an arbitrary attachment. Fortunately, UNIX systems provide the file command, which can read any file and work out the MIME type of its contents. On a Windows server, no such analogue exists, but it is possible to obtain file through Microsoft Services for Unix, or UnxUtils.
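Where the file command is not available, one fallback is to guess the type from the file's name instead of its contents. The Python sketch below uses the standard `mimetypes` module to do this; note that this is a different and weaker technique than the content inspection performed by file, and the helper name `guess_mime` is an assumption for this example.

```python
import mimetypes


def guess_mime(filename, default="application/octet-stream"):
    """Guess a MIME type from the filename extension alone."""
    mime, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions fall back to a generic binary type
    return mime or default


print(guess_mime("text-icon.gif"))
print(guess_mime("README"))
```

A renamed file will be misidentified by this approach, which is why inspecting the contents is preferable when it is possible.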
A MIME-compliant email solution is provided below, making use of this tactic and the information presented in this article.
    define('MIMEMAIL_HTML', 1);
    define('MIMEMAIL_ATTACH', 2);
    define('MIMEMAIL_TEXT', 3);

    class MIMEMail
    {
        private $plaintext;
        private $output;
        private $headers;
        private $boundary;

        public function __construct()
        {
            $this->output = '';
            $this->headers = '';
            $this->boundary = md5(microtime());
            $this->plaintext = 0;
        }

        // add: Add a part to the email
        // Parameters: type (Constant) - MIMEMAIL_TEXT, MIMEMAIL_HTML, MIMEMAIL_ATTACH
        //             name (String)   - Contents of email part if TEXT or HTML
        //                             - Attached name of file if ATTACH
        //             value (String)  - Source name of file if ATTACH
        public function add($type, $name, $value='')
        {
            switch($type)
            {
                case MIMEMAIL_TEXT:
                    $this->plaintext = (strlen($this->output))?0:1;
                    $this->output = "{$name}\r\n" . $this->output;
                    break;

                case MIMEMAIL_HTML:
                    $this->plaintext = 0;
                    $this->writePartHeader($type, "text/html");
                    $this->output .= "{$name}\r\n";
                    break;

                case MIMEMAIL_ATTACH:
                    $this->plaintext = 0;
                    if(is_file($value))
                    {
                        // If the file exists, get its MIME type from `file`
                        // NOTE: This will only work on systems which provide `file`: Unix, Windows/SFU
                        $mime = trim(exec('file -bi '.escapeshellarg($value)));
                        if($mime) $this->writePartHeader($type, $name, $mime);
                        else $this->writePartHeader($type, $name);

                        $b64 = base64_encode(file_get_contents($value));

                        // Cut up the encoded file into 64-character pieces
                        $i = 0;
                        while($i < strlen($b64))
                        {
                            $this->output .= substr($b64, $i, 64);
                            $this->output .= "\r\n";
                            $i += 64;
                        }
                    }
                    break;
            }
        }

        // addHeader: Provide additional message headers (Cc, Bcc)
        public function addHeader($name, $value)
        {
            $this->headers .= "{$name}:{$value}\r\n";
        }

        // send: Complete and send the message
        public function send($from, $to, $subject)
        {
            $this->endMessage($from);
            return mail($to, $subject, $this->output, $this->headers);
        }

        // writePartHeader: Helper function to add part headers
        private function writePartHeader($type, $name, $mime='application/octet-stream')
        {
            $this->output .= "--{$this->boundary}\r\n";
            switch($type)
            {
                case MIMEMAIL_HTML:
                    $this->output .= "Content-type:{$name}; charset=\"iso8859-1\"\r\n";
                    break;

                case MIMEMAIL_ATTACH:
                    $this->output .= "Content-type:{$mime}\r\n";
                    $this->output .= "Content-disposition: attachment; filename=\"{$name}\"\r\n";
                    $this->output .= "Content-transfer-encoding: base64\r\n";
                    break;
            }
            $this->output .= "\r\n";
        }

        // endMessage: Helper function to build message headers
        private function endMessage($from)
        {
            if(!$this->plaintext)
            {
                $this->output .= "--{$this->boundary}--\r\n";
                $this->headers .= "MIME-Version: 1.0\r\n";
                $this->headers .= "Content-type: multipart/mixed; boundary=\"{$this->boundary}\"\r\n";
                $this->headers .= "Content-length: ".strlen($this->output)."\r\n";
            }
            $this->headers .= "From:{$from}\r\n";
            $this->headers .= "X-Mailer: MIME-Mail v0.03, 20070419\r\n\r\n";
        }
    }
    include('mimemail.php');

    $m = new MIMEMail();

    // Provide the message body
    $m->add(MIMEMAIL_TEXT, 'An example email message.');

    // Attach file 'icons/txt.gif', and call it 'text-icon.gif' in the email
    $m->add(MIMEMAIL_ATTACH, 'text-icon.gif', '/var/www/icons/txt.gif');

    // Send to the author
    $m->send('noreply@oopsilon.com', '"Imran Nazar" <tf@oopsilon.com>', 'Test message');
Download the script: mimemail.php
Imran Nazar <tf@oopsilon.com>, 2008
This article is intended to be part 1 of a series, in which a Commodore 64 will be set up to act as a standard Unix-compatible terminal. The first step in that ambitious program is to provide a reasonable text display on the Commodore 64.
The Commodore 64 has a display resolution of 320 by 200 pixels, a resolution which will be familiar to any PC programmer who has dealt with the VGA display modes. The C64 provides two basic types of display mode: bitmapped and tiled. Each of these has the ability to display pixels in one colour against a background of another colour. In both modes, the display is broken up logically into 8x8-pixel "tiles"; the difference is in how these are handled and how graphics are drawn in the two modes.
In tiled mode, also called "text mode", the screen has a tile-resolution of 40 wide by 25 high, and the display is built in the following manner.
The tile-address buffer, often called "Screen Memory", is a 1000-byte region of memory where each byte refers to an 8x8 block on screen. In order to get the bitmap data for the display, the video circuit uses the value in screen memory as a pointer into the tile-data buffer, called "Character Memory". When the computer is first started, this memory contains the shapes of letters and numbers which can be used to draw text on the screen; for this reason, tiled mode is often referred to as "40x25 text mode".
Tiled mode allows for each 8x8 block of pixels to have a different foreground colour; any bits in the tile which are set to "1" will be drawn in the foreground colour for that tile. Just like screen memory, a 1000-byte region is set aside for the tile-colour buffer, called "Colour Memory", which provides the foreground colours for each tile. The background is the same across the whole screen, and any "0" bits will be drawn in the global background colour.
Bitmapped mode skips the tile-addressing step in the display process, opting instead for a unified buffer of bitmap data. An 8000-byte region is set aside for the display of 320x200 bits, with blocks of 8x8 still being addressed as a tile.
Just as with tiled mode, each 8x8 block can have a different foreground colour. In the case of bitmapped mode, however, a block can also have its own background colour, which will be used instead of the global background if any "0" bits are encountered in the bitmap.
So we have two options for drawing to the C64's screen. Either of these could be used for the rendering of an 80x25 text mode, but there are a few arguments against the use of tiled mode:
For these reasons, it's simpler to use bitmapped mode to draw the characters. What's required now is a readable font that can be used by the rendering system.
Most terminal systems use a mono-spaced font, primarily because it makes calculations easier regarding text placement and size. This extended text mode will be no exception: in order to fit an 80x25 text screen into a 320x200 graphical display, each character must be 4x8 pixels: in other words, each of the 8x8 tiles must be cut down the middle and a character placed in each half.
What this doesn't take into account is the need to separate characters: if the font is made up of 4x8-pixel glyphs, each character in a line of text will be joined to the next, without any separation. What is instead needed is a pixel of separation between characters: this means that the font will consist of 3x7-pixel glyphs in 4x8 boxes.
On such a small scale, designing a legible font is tricky: distinguishing between zero (0) and capital O is difficult at the best of times, and the difference between one (1), small L and the vertical pipe (|) can be even more of a problem. I'm not a font designer, so I opted instead to use the font glyphs from Novaterm, a terminal program for the C64.
In order to use this font programmatically, each character has to be broken down into its constituent bits, and reconstituted as data. Because the glyphs are 4 pixels wide, the resultant data will be 4 bits wide.
With a font and a bitmap mode, we can now draw text to the bitmap. Unfortunately, it's not quite as easy as writing one character to each tile, because there are only 40 tiles' worth of space across the screen. Instead, two characters have to be put inside one tile-space. This involves shifting the bitmap values for the "left" character across, and combining them with the "right" character.
In BASIC code, this could be represented as follows, assuming that the FONT two-dimensional array represents the 3x7 Novaterm font:
    LET CH1 = 72:REM "H"
    LET CH2 = 101:REM "e"
    FOR A = 0 TO 7
    OUT(A) = (FONT(CH1, A) * 16) + FONT(CH2, A)
    NEXT A
By using a "cursor" position to keep track of where on the screen tiles must be filled, it's relatively straightforward to use the technique above for rendering text two characters at a time. The problem arises when a string of text contains an odd number of characters: not only does the renderer have to fill half a tile instead of a full tile, but the next string will start halfway through the tile in question. Because of this, the rendering function becomes more complex:
The top and bottom pieces of algorithm are extensions of the main rendering loop, and won't be covered here in much detail. Instead, I'll provide an interpretation of the insides of the main loop, in pseudo-C++.
    BYTE *bitmap;   // 8000-byte bitmap to render to
    BYTE *font;     // 2048 bytes font data, 8 bytes per char
    char t1, t2;    // Text to render (two characters long)
    int X, Y;       // Current cursor position, in character cells (X: 0-79, Y: 0-24)

    // Calculate position of destination tile in the bitmap
    // Each tile is 8 bytes long and each row holds 40 tiles;
    // two characters share a tile, so the tile column is X / 2
    BYTE *tile = bitmap + ((Y * 40 + X / 2) * 8);

    for(int i=0; i<8; i++)
    {
        // Retrieve font data for this line of the bitmap
        BYTE ch1 = font[(unsigned char)t1 * 8 + i];
        BYTE ch2 = font[(unsigned char)t2 * 8 + i];

        // Calculate final tile contents
        tile[i] = (ch1 * 16) + ch2;
    }
In the case of the top and bottom parts of the algorithm, either ch1 or ch2 is not used in the final tile; otherwise, the code for these parts is as above.
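The half-tile cases can be illustrated with a Python sketch that draws one character at a time, masking off whichever half of the tile is not being written. This is a simplification of the pair-at-a-time loop described above, not the article's 6510 implementation; the function names, the placeholder font and the assumption that character codes are below 256 are all invented for this example.

```python
TILES_PER_ROW = 40    # 320 pixels across / 8 pixels per tile
BYTES_PER_TILE = 8    # one byte for each pixel row of a tile
COLUMNS = 80          # text columns: two characters per tile


def draw_char(bitmap, font, col, row, code):
    """Draw one 4x8 glyph at character cell (col, row)."""
    tile = (row * TILES_PER_ROW + col // 2) * BYTES_PER_TILE
    for i in range(8):
        glyph = font[code * 8 + i] & 0x0F
        if col % 2 == 0:
            # Left half: shift the glyph up, keep the right half intact
            bitmap[tile + i] = (bitmap[tile + i] & 0x0F) | (glyph << 4)
        else:
            # Right half: keep the left half intact
            bitmap[tile + i] = (bitmap[tile + i] & 0xF0) | glyph


def draw_text(bitmap, font, col, row, text):
    """Draw a string from the cursor position; return the new cursor."""
    for ch in text:
        draw_char(bitmap, font, col, row, ord(ch))
        col += 1
        if col == COLUMNS:
            col, row = 0, row + 1
    return col, row


# Placeholder font: every row of a glyph is the low 4 bits of its code
font = bytes((code & 0x0F) for code in range(256) for _ in range(8))
bitmap = bytearray(8000)

# Start halfway through the first tile, as happens after an odd-length string
cursor = draw_text(bitmap, font, 1, 0, "Hi!")
print(cursor, hex(bitmap[0]), hex(bitmap[8]))
```

Writing a character at a time costs an extra read and mask per byte, which is the work the two-character loop avoids for the bulk of a string.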
There are a few things that need to be considered when taking this algorithm to the Commodore 64, in order to cope with the restrictions of the platform.
The algorithm outlined above takes two characters from the font data, and shifts the "left" one over by 4 bits before tacking it to the "right" character. This step can be eliminated by keeping a pre-shifted copy of the font data as a separate buffer to the original, which means that building a cell is merely a matter of finding one character in the original font, and the other in the shifted font, then adding the two values.
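A minimal Python sketch of the pre-shifted buffer follows. The two-byte font here is a placeholder rather than real glyph data, and `preshift` is a name chosen for this example.

```python
def preshift(font):
    """Build a copy of the font with every byte moved into the high nibble."""
    return bytes((value << 4) & 0xF0 for value in font)


font = bytes([0b0101, 0b0010])   # placeholder rows for two glyphs
shifted = preshift(font)

# Building a tile row is now a lookup in each table plus an addition
tile_row = shifted[0] + font[1]
print(format(tile_row, "08b"))
```

The trade-off is memory: the shifted copy doubles the space taken by the font in exchange for removing four shifts per byte at render time.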
As mentioned above, each tile in bitmap mode can maintain its own foreground and background colour. The colours for each tile are stored one byte per tile: the background colour code (between 0 and 15) is stored as the lower half of the byte, and the foreground code as the upper half. In the C64's standard bitmap mode, the video chip reads these bytes from the 1000-byte region that acts as Screen Memory in tiled mode, rather than from Colour Memory.
For the purposes of this article, we won't be dealing with different colours of text or other attributes, so all that's required is to initialise these colour bytes: setting all of them to reflect "grey on black" allows for a simple monochromatic output.
The 6510 CPU used by the Commodore 64 doesn't have a multiply instruction, which means we can't simply "multiply by 8" to get a font-data position. Luckily, we can use a basic property of binary powers to perform the calculation:
    x * 8 = x * (2 ** 3)
    x * 8 = x << 3
By shifting the value left, we can simulate multiplication. In this case, however, that's not quite enough. Shifting a value left pushes the left-most bits off the end of the register, discarding the higher portion of the result: we need that higher portion, so some more calculation is required:
    x = 01101101b
    LOBYTE(x * 8) = 01101101 << 3 = [011]01101000
    HIBYTE(x * 8) = [011] = 01101101 >> (8-3)

    LOBYTE(x * 8) = x << 3
    HIBYTE(x * 8) = x >> 5
The above sample demonstrates a more general rule: multiplying a 1-byte value by a power of two can generate a 2-byte result, both parts of which can be calculated by appropriate shifting. This rule can be used by the 6510 code of the implementation.
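The two shifts can be checked against ordinary multiplication with a few lines of Python. The function name `times8` is an assumption for this example, and the masking with 0xFF stands in for the 8-bit register of the 6510.

```python
def times8(x):
    """Multiply a byte by 8 using only shifts; return (high byte, low byte)."""
    lo = (x << 3) & 0xFF   # bits pushed past the register are discarded
    hi = x >> 5            # the top three bits, which the left shift lost
    return hi, lo


# Check every possible byte value against a real multiplication
for x in range(256):
    hi, lo = times8(x)
    assert (hi << 8) | lo == x * 8

print(times8(0b01101101))
```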
After putting these algorithms into code, something like the following will be produced:
In the example above, the additional algorithms for handling newline characters have been added to the 80x25 display system, allowing the text to contain line breaks. This is simply a matter of moving the cursor down to the start of the next line when a newline character is encountered.
The system does not currently handle scrolling of the text buffer: if text is to be drawn below line 25, it will not appear on the display. Scrolling, and other control sequences including character colour, will be covered in part 2 of this series.
80x25.s: 6510 assembly source
ansi.font: Encoded font data
80x25.prg: Assembled binary, emulation-ready
In most instances, the information in a file is encoded as a series of structures, grouping related information such that a programming interface can retrieve them easily. In C and C++, a special type is set aside for just such a reason: the struct.
    typedef struct {
        unsigned char b;        /* Blue */
        unsigned char g;        /* Green */
        unsigned char r;        /* Red */
        unsigned char reserved;
    } RGBQUAD;                  /* A BMP palette entry is stored blue first */
The above is the C representation of one palette entry in a Windows BMP.
Once this structure has been defined as a type with typedef, using it is very simple:
    RGBQUAD palette[256];
    fread(palette, sizeof(RGBQUAD), 256, file_handle);
    printf("Colour #0 is %02X%02X%02X.\n", palette[0].r, palette[0].g, palette[0].b);
It can be seen above that the data contained within a struct can be accessed in much the same way as members can be accessed within a class, in C++ or any other object-oriented programming language. Indeed, in C++ the keywords class and struct are almost equivalent: they differ only in the default visibility of members, which is public for a struct and private for a class.
When using languages such as PHP, a problem arises: PHP does not support a native struct type. Further, since PHP is a loosely-typed language, it's not possible to read data from an encoded file format directly into PHP and manipulate it. This can be alleviated by using PHP's class keyword to build a class containing the structure members:
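    class RGBQUAD
    {
        // Byte size of the structure: four unsigned bytes
        const SIZE = 4;

        // Structure members
        public $r;
        public $g;
        public $b;
        public $reserved;

        // Initialise members given packed data
        public function __construct($data)
        {
            if($data)
            {
                // 'C4' reads four unsigned bytes. unpack() numbers its results
                // from 1, so array_values() re-indexes them from 0 for list().
                // A BMP palette entry is stored blue first.
                list($this->b, $this->g, $this->r, $this->reserved) = array_values(unpack('C4', $data));
            }
        }
    }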
This representation of the structure makes use of the unpack function, to take a string of binary data and load it into the class members. This can be used in a similar fashion to the C representation:
    $palette = array();
    for($i=0; $i<256; $i++)
        $palette[$i] = new RGBQUAD(fread($file_handle, RGBQUAD::SIZE));

    printf("Colour #0 is %02X%02X%02X.\n", $palette[0]->r, $palette[0]->g, $palette[0]->b);
This approach has distinct advantages over the C struct type. In particular, the PHP implementation is a class, and can contain methods other than the simple constructor; furthermore, complex types can be contained within the structure in a way that C cannot accomplish.
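The same pattern carries over to other dynamically typed languages. As a point of comparison, here is a sketch in Python using the standard `struct` module and a named tuple. The names `RGBQuad`, `RGBQUAD_FORMAT` and `read_palette` are assumptions for this example, and the member order follows the BMP convention of storing blue first.

```python
import struct
from collections import namedtuple

# One palette entry: four unsigned bytes, stored blue first in a BMP
RGBQuad = namedtuple("RGBQuad", "b g r reserved")
RGBQUAD_FORMAT = struct.Struct("4B")


def read_palette(data, count=256):
    """Unpack `count` consecutive palette entries from a bytes object."""
    return [
        RGBQuad(*RGBQUAD_FORMAT.unpack_from(data, i * RGBQUAD_FORMAT.size))
        for i in range(count)
    ]


# Two made-up entries, standing in for data read from a file
raw = bytes([0x33, 0x22, 0x11, 0x00, 0xFF, 0x80, 0x00, 0x00])
palette = read_palette(raw, count=2)
print("Colour #0 is %02X%02X%02X." % (palette[0].r, palette[0].g, palette[0].b))
```

As with the PHP class, the format string is the single place where the on-disk layout is described.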
An example of complex usage of the structure pattern is in parsing of the TrueType file format. A TrueType file defines a vector or bitmap font face, and contains a series of data tables: a table of names, a table of Windows-specific font information, and tables of glyph definitions. Also contained in the file structure is a header defining the table "directory", which allows a parser to find these tables within the file.
A TrueType file begins with information about the version of the TrueType specification, followed by the table directory. This can be represented in PHP in a simple manner:
    // File header. This contains the number of tables in the TTF.
    class ttfHeader
    {
        const SIZE = 12;

        public $majorVersion;
        public $minorVersion;
        public $tableCount;
        public $searchRange;
        public $entrySelector;
        public $rangeShift;
        public $tableDirectory;

        public function __construct($file)
        {
            $this->tableDirectory = array();

            // Six big-endian 16-bit values; unpack() numbers its results
            // from 1, so array_values() re-indexes them from 0 for list()
            $header = fread($file, self::SIZE);
            list($this->majorVersion, $this->minorVersion, $this->tableCount, $this->searchRange, $this->entrySelector, $this->rangeShift) = array_values(unpack('n*', $header));

            for($i=0; $i<$this->tableCount; $i++)
            {
                $this->tableDirectory[$i] = new ttfTableDirectoryEntry(fread($file, ttfTableDirectoryEntry::SIZE));
            }
        }
    }

    // Table directory. Describes the location and size of a table in the TTF.
    class ttfTableDirectoryEntry
    {
        const SIZE = 16;

        public $tag;
        public $checksum;
        public $offset;
        public $length;

        public function __construct($data)
        {
            // Four big-endian 32-bit values
            list($tag, $this->checksum, $this->offset, $this->length) = array_values(unpack('N*', $data));

            // Build a string tag from the numeric value
            $this->tag = chr(($tag>>24)&255).
                         chr(($tag>>16)&255).
                         chr(($tag>> 8)&255).
                         chr(($tag>> 0)&255);
        }
    }
This example shows how it's possible for a structure to contain more information than the sum of its members. In the case of the ttfHeader, the constructor can pull in the structure of an entry in the table directory, and build its own array to represent the directory.
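To show the header and directory layout being read end to end, the following Python sketch parses the same 12-byte header and 16-byte directory entries with the standard `struct` module. The sample data is fabricated for the example (two entries with made-up offsets and lengths, not a real font), and the function name `parse_directory` is an assumption.

```python
import struct

HEADER = struct.Struct(">HHHHHH")   # six big-endian 16-bit values, 12 bytes
ENTRY = struct.Struct(">4sLLL")     # tag plus three 32-bit values, 16 bytes


def parse_directory(data):
    """Return ((major, minor), {tag: (offset, length)}) from TTF header bytes."""
    major, minor, count, _search, _selector, _shift = HEADER.unpack_from(data, 0)
    tables = {}
    for i in range(count):
        position = HEADER.size + i * ENTRY.size
        tag, _checksum, offset, length = ENTRY.unpack_from(data, position)
        tables[tag.decode("ascii")] = (offset, length)
    return (major, minor), tables


# Fabricated header: version 1.0, two tables
sample = (
    HEADER.pack(1, 0, 2, 32, 1, 0)
    + ENTRY.pack(b"cmap", 0, 44, 100)
    + ENTRY.pack(b"glyf", 0, 144, 200)
)
version, tables = parse_directory(sample)
print(version, tables)
```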
All in all, the Structure pattern makes it possible for PHP to represent structures in a similar fashion to C structs; it also allows PHP to be more versatile, and represent more complex information related to the data, which would normally have to be held outside the structure.
© Imran Nazar <tf@oopsilon.com>, 2008
Anyone coming from outside the network (outside my house, in other words) will be able to view the article without a problem: a request is made to Oopsilon, which resolves (currently) to 87.194.101.173, and a connection request is made to that IP address. My firewall translates that to the web server's internal IP of 192.168.0.1, and maintains the translation both ways.
From inside the network, it's another story. Oopsilon resolves to 87.194.101.173 as before, but when the firewall receives that connection request, it sees a connection from the internal network to the outside world, and immediately back into the network. As a result, the firewall refuses to connect the request, and I end up unable to see my article.
The problem inside the network is caused by Oopsilon resolving to an external IP. This happens because BIND is configured with a simple DNS zone, as follows:
                    IN SOA  adhocbox.oopsilon.com. tf.oopsilon.com. (
                                2008042701  ; Serial
                                28800       ; Refresh
                                14400       ; Retry
                                604800      ; Expire
                                86400 )     ; Minimum
                    NS      adhocbox.oopsilon.com.
                    MX 10   oopsilon.com.
    oopsilon.com.   IN A        87.194.101.173
    adhocbox        IN CNAME    oopsilon.com.
    www             IN CNAME    oopsilon.com.
BIND is then told to use this zone for requests relating to the domain in question, as follows:
    zone "oopsilon.com" IN {
        type master;
        file "oopsilon.zone";
        allow-update { 88.192.91.15; };
        notify yes;
    };
The configuration above states that any requests for the domain will be serviced by the zone file given. This includes requests from inside the LAN, which should resolve to the LAN address of the web server. This can be fixed by using not one, but two zone files.
For external requests, the zone file above is sufficient: serving the external IP is what these clients will expect. For internal requests, a separate zone can be used:
                    IN SOA  adhocbox.oopsilon.com. tf.oopsilon.com. (
                                2008042701  ; Serial
                                28800       ; Refresh
                                14400       ; Retry
                                604800      ; Expire
                                86400 )     ; Minimum
                    NS      adhocbox.oopsilon.com.
                    MX 10   oopsilon.com.
    oopsilon.com.   IN A        192.168.0.1
    adhocbox        IN CNAME    oopsilon.com.
    www             IN CNAME    oopsilon.com.
In order to select between the two zone files, a series of "views" can be set up in the configuration file, where each view is matched against a series of IP addresses. This is done by nesting zones inside view blocks:
    view "internal" {
        match-clients { 192.168.0.0/24; };
        zone "oopsilon.com" IN {
            type master;
            file "oopsilon.zone.int";
            allow-update { none; };
            notify no;
        };
    };

    view "external" {
        match-clients { any; };
        zone "oopsilon.com" IN {
            type master;
            file "oopsilon.zone";
            allow-update { 88.192.91.15; };
            notify yes;
        };
    };
In the above configuration, there are two views: internal for clients from the internal network (192.168.0.x), and external for everyone else. A view can have any number of zones inside, but in this case I only need one zone in each.
Once this configuration has been put in place, its operation is automatic: anyone from the LAN will receive the LAN IP of the web server, and will be able to view the web site. Clients outside the network will receive an external IP, and also be able to see the web site. Everyone wins.
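The selection logic can be modelled in a few lines of Python: views are tried in the order they are listed, and the first whose client list matches the requester wins, which is why the catch-all external view has to come last. This is only a model of the behaviour, not BIND code; the `VIEWS` table and `resolve` function are names invented for the example.

```python
import ipaddress

# (view name, matching network, address served for oopsilon.com)
VIEWS = [
    ("internal", ipaddress.ip_network("192.168.0.0/24"), "192.168.0.1"),
    ("external", ipaddress.ip_network("0.0.0.0/0"), "87.194.101.173"),
]


def resolve(client_ip):
    """Return (view name, answer) for the first view matching the client."""
    address = ipaddress.ip_address(client_ip)
    for name, network, answer in VIEWS:
        if address in network:
            return name, answer
    return None


print(resolve("192.168.0.42"))
print(resolve("203.0.113.9"))
```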
Copyright Imran Nazar <tf@oopsilon.com>, 2008
Compression is simply the name for a set of procedures that allow data to be packed into a smaller space, while still allowing the original data to be retrieved from the compressed encoding. It's a two-way process: an input file can yield compressed output, and putting the compressed output back through the algorithm should give you a copy of the input.
The concept that makes compression possible is redundancy: the fact that most data repeats itself in some fashion. A document may use the same word many times, for example, or a picture will contain the same colour in many places. A very simple example of a redundant piece of data could be something like the following.
AAAAABBWWWWWWWWWPPPPQZMMMMVVV
In this case, the redundancy is obvious; repeated series of letters present themselves throughout the sample. An easy way to compress this would be to represent the repeated letters by the number of repeats, thus cutting down on the total length of the sample.
A5B2W9P4Q1Z1M4V3
An algorithm reading this encoded version of the sample will be able to perfectly retrieve the original data: "A" five times, "B" twice, and so on. This simple algorithm is used extensively, and is called run-length encoding (RLE): writing down how long each run of characters is. An example of a widely used standard employing RLE is the venerable PCX image format.
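The letter-and-count scheme above can be sketched in Python as follows. It assumes the input contains no digits, since a digit in the data would be indistinguishable from a run length; the function names are chosen for this example.

```python
import re
from itertools import groupby


def rle_encode(text):
    """Replace each run of a repeated character with the character and its count."""
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))


def rle_decode(encoded):
    """Expand each character-and-count pair back into a run."""
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(ch * int(count) for ch, count in pairs)


sample = "AAAAABBWWWWWWWWWPPPPQZMMMMVVV"
packed = rle_encode(sample)
print(packed)
print(rle_decode(packed) == sample)
```

Note that the single letters Q and Z each grow from one character to two, which is the usual weakness of run-length encoding on data without long runs.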
In Figure 1, there are many solid blocks of single colours. This image is 500 pixels wide, and 190 high; as a raw bitmap, using one byte to represent a pixel, this image would constitute 95kB of data. The PCX algorithm calculates run lengths for each line of pixels in the image, and then saves the run length for consecutive pixels of the same colour: in this way, the size of the image is reduced to 52kB.
One of the major problems with RLE is that it acts on consecutive values of data: in Figure 1, the RLE algorithm will treat each horizontal line of the image separately, whereas all the lines are the same as each other. This can be alleviated by looking at the data in the whole, and building a table of how often each value occurs in the entire data set.
Huffman encoding is a method of using this "frequency table", which denotes the frequency of occurrence for each value, and assigning each entry a code. The most frequent entries are given shorter codes, and rarer entries are relegated to receiving long codes. In computing, these codes are invariably binary codes, which can then be combined into bytes for file storage.
Using the example above, a sample Huffman encoding process may run as follows:
AAAAABBWWWWWWWWWPPPPQZMMMMVVV
Value | Frequency | Code |
---|---|---|
Q | 1 | 10100 |
Z | 1 | 10101 |
B | 2 | 1011 |
V | 3 | 010 |
P | 4 | 011 |
M | 4 | 100 |
A | 5 | 00 |
W | 9 | 11 |
00 00 00 00 00 1011 1011 11 11 11 11 11 11 11 11 11 011 011 011 011 10100 10101 100 100 100 100 010 010 010
Packed into bytes (hexadecimal, with one padding bit at the end): 00 2E FF FF F6 DB A5 64 91 24
Using Huffman encoding, the data has been whittled down from 29 characters to 10 bytes. This does not include the frequency and coding table, which has to be stored with the compressed data for it to make any sense; in this example, the frequency table is larger than the compressed data, but the size of the frequency table is negligible in most cases.
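A compact way to build such a code table is to repeatedly merge the two least frequent entries, prefixing their codes with 0 and 1. The Python sketch below does this with the standard `heapq` module. The exact codes it produces depend on how ties between equal frequencies are broken, so they may differ from any particular table, but the total encoded length for this sample comes out the same. The function name `huffman_codes` is an assumption for this example, and an input with only one distinct value is not handled.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to its binary code string."""
    order = count()   # unique tie-breaker, so the dicts are never compared
    heap = [(freq, next(order), {symbol: ""})
            for symbol, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups into one
        freq1, _, group1 = heapq.heappop(heap)
        freq2, _, group2 = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in group1.items()}
        merged.update({symbol: "1" + code for symbol, code in group2.items()})
        heapq.heappush(heap, (freq1 + freq2, next(order), merged))
    return heap[0][2]


sample = "AAAAABBWWWWWWWWWPPPPQZMMMMVVV"
codes = huffman_codes(sample)
bits = "".join(codes[symbol] for symbol in sample)
print(codes)
print(len(bits), "bits,", (len(bits) + 7) // 8, "bytes")
```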
It is, of course, possible to combine RLE and Huffman encoding, performing RLE first and then running the compressed result through the Huffman algorithm. Algorithms that exploit repetition across the whole data set produce especially good results on simple images: Figure 1 above can be compressed from a 95kB bitmap to a 4kB file by using the GIF file format, which relies on LZW dictionary coding rather than on RLE and Huffman codes.
The methods outlined above can be used to compress data in such a manner that it can be perfectly reproduced. Examples of this usage of compression include documents and software programs, where the loss or corruption of one value may render the file worthless.
In certain circumstances, a perfect reproduction of the data in question is not necessary: a close approximation is sufficient. Generally, these circumstances arise in multimedia applications: sounds beyond the range of human hearing need not be recorded, and subtleties of colour and gradient beyond the discernment of the human eye need not be reproduced.
A classic example of this is the MPEG Audio standard, which attempts to reduce the size of audio files by removing extraneous information regarding high-frequency sounds. The Layer-3 specification of this standard allows for various settings of removal, by which progressively more information will be removed from the audio sample.
In Figure 2 above, two waveforms are superimposed: the original song waveform in red, and a highly compressed variant overlaid in blue. The sample shown above is 1.5 seconds long; as a section in the original waveform file, this sample is stored using 160kB of data. The compressed variant shown is of the same length, occupying only 48kB of space.
This has been achieved by the MPEG Audio compression algorithm, by transposing the sound into its frequency components, and removing those components beyond the range of human hearing (above approximately 20kHz). By doing this, the resultant waveform is not significantly affected, as can be seen above, and thus the compressed sound is not perceptibly different from that of the original source.
Just as a sound file has high-frequency components that can't be discerned by the ear, a picture has high-frequency components: shades of colour that aren't different enough for the eye to distinguish, or gradients that run from black to white so quickly that there's no space for the gradient to be seen. Just as with sound, these components can be removed from a picture; this is the premise of the JPEG image format.
JPEG performs a variant of the same algorithm used in MPEG Audio, to retrieve a two-dimensional map of the frequency components contained within an image; the algorithm then proceeds to cut the components down, and recombine the image. An example of this process is shown below.
In Figure 3, an image composed of four 16x16-pixel squares is compared against the JPEG-encoded variant of the same file. A sharp change in colour or luminance is defined as an event of high visual frequency, and it is here where JPEG performs its removal. As a result, the encoded image has a lower definition to its edges, and the meeting point of the four squares is especially blurred.
The strength of JPEG is not in encoding images of sharp edges and corners, but instead in images of low visual frequency; photographs are a prime example of such.
In Figure 4, a 300x300 image of Antalya Harbour is encoded by JPEG. The original bitmap is 270kB, whereas by removal of the sharp edges and colour changes, JPEG is able to produce a 22kB image. As far as the human eye is concerned, very little has changed in the image; the features shown in the image survive intact, even if the pixels have changed somewhat.
This is the main concept behind lossy encoding: that the exact data is not as important as the information presented by the data. Using the JPEG algorithm to encode a software program would be unwise, but in cases where the information is more than the sum of the data, lossy encoding is ideal.
When it comes to video clips, it's possible to compress the data involved yet further, by combining the principles behind lossless and lossy encoding. The simplest and most naive method of building a video clip is to tack together consecutive pictures and refer to them as frames: the MJPEG video file format does this by treating a series of JPEG images as individual frames.
What this approach ignores is the inherent redundancy in a video clip: most of the information contained in a given frame is also in the previous frame. Only a small percentage of any particular frame is new information; by calculating where that percentage of information lies, and storing only that amount, it's possible to drastically cut down the data size of the frame.
In Figure 5, the second frame of video shows very little change relative to the first: only in the Shuttle's exhaust plume is there significant motion. Indeed, the output of the SRBs and the sky behind the launch tower are entirely unchanged between frames. Instead of storing these portions of the image in their entirety, it's possible to store a single value: "No change".
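As a rough sketch of the "No change" idea, the following Python splits a frame into small tiles and stores only the tiles that differ from the previous frame. The tile size and function names are assumptions made for illustration; this is not how any particular codec lays out its data.

```python
def encode_delta(previous, current, tile=4):
    """Return {(top, left): tile_rows} for every tile that has changed."""
    changed = {}
    for top in range(0, len(current), tile):
        for left in range(0, len(current[0]), tile):
            old = [row[left:left + tile] for row in previous[top:top + tile]]
            new = [row[left:left + tile] for row in current[top:top + tile]]
            if new != old:
                changed[(top, left)] = new
    return changed

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = [row[:] for row in previous]
    for (top, left), rows in delta.items():
        for dy, row in enumerate(rows):
            frame[top + dy][left:left + len(row)] = row
    return frame

# Two 8x8 greyscale frames which differ in a single pixel
first = [[0] * 8 for _ in range(8)]
second = [row[:] for row in first]
second[5][6] = 255

delta = encode_delta(first, second)
print(sorted(delta))   # only one of the four tiles needs to be stored
```

Every tile missing from the delta is implicitly "No change", so a mostly static shot costs very little to store.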
The MPEG Video standard makes use of this inherent redundancy as a part of its compression algorithm. In theory, only the initial frame of a shot is required in full: any movement as part of the shot can be stored as a difference from the previous frame. The initial frame, known as an Intra-frame, is stored as a standard JPEG image, and the subsequent difference frames are called inter-frames, or Predicted frames.
In practice, the MPEG Video standard was designed with "streaming" in mind: the ability to begin viewing a video clip halfway through a shot. If only one Intra-frame (I-frame) is provided for the shot, it's not possible for the Predicted frames (P-frames) to interpolate their differences. For this reason, I-frames are commonly inserted at regular intervals into the video clip, regardless of whether a shot is in progress.
In Figure 6 above, the video clip has I-frames inserted at 25-frame intervals, or once a second. The subsequent P-frames are each much smaller in size than the I-frame, since politicians tend not to move around very much when interviewed, thus causing a lower amount of difference between frames.
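To see why the regular I-frames matter for streaming, here is a small Python sketch that works out which frames have to be decoded before a given frame can be shown. It assumes the simple layout described above (one I-frame every `gop` frames, with only P-frames in between); the function name and the 0-based frame numbering are my own.

```python
def frames_to_decode(target, gop=25):
    """Frames needed to display frame `target`, starting from the nearest I-frame."""
    keyframe = target - (target % gop)   # most recent I-frame at or before target
    return list(range(keyframe, target + 1))

# Jumping to frame 60 of a clip with an I-frame every 25 frames
needed = frames_to_decode(60)
print(needed[0], len(needed))

# The same jump if the only I-frame were the very first frame of the clip
print(len(frames_to_decode(60, gop=1000)))
```

With I-frames once a second, the decoder never has to work through more than a second's worth of P-frames before it can show a picture.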
The example used for Figure 6 was a 400x224 video clip of 4 seconds. In raw bitmap form, the size of the resultant file would be 26.7MB; by using the combined techniques of lossy encoding and redundancy, the MPEG Video standard is able to reduce this to 300kB, a reduction of 99%.
The examples of lossy encoding presented in this article are employed in special circumstances: audio, video, pictures. It's only in these instances, and others related to these, that perception is the important factor in the compression process. For other compression targets, such as documents and software programs, it's important to preserve the data exactly as-is.
More advanced specialisations of compression are being developed all the time, but most common implementations of compression are based on the techniques in this article: eliminating redundant and duplicate information. Compression works best when there's a lot of redundant data, so don't try to compress a compressed file.
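The last point is easy to check for yourself. The sketch below uses Python's standard zlib module on some deliberately repetitive text (the word list and seed are arbitrary choices of mine), compresses it, and then compresses the result a second time.

```python
import random
import zlib

rng = random.Random(2008)   # fixed seed, so every run builds the same text
words = ["invalid", "user", "from", "sshd", "adhocbox", "may"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(text, 9)    # plenty of redundancy to remove
twice = zlib.compress(once, 9)   # very little left for a second pass

print("original:        ", len(text))
print("compressed once: ", len(once))
print("compressed twice:", len(twice))
```

The first pass removes most of the redundancy; the second pass has almost nothing to work with, and the container overhead can even make the file slightly larger.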
Imran Nazar <tf@oopsilon.com>, 2008
May 8 01:46:43 adhocbox sshd[28514]: Invalid user tanta from 61.100.x.x
May 8 01:46:46 adhocbox sshd[28516]: Invalid user cornel from 61.100.x.x
May 8 01:46:49 adhocbox sshd[28518]: Invalid user ronaldo from 61.100.x.x
May 8 01:46:51 adhocbox sshd[28520]: Invalid user wave from 61.100.x.x
May 8 01:46:54 adhocbox sshd[28522]: Invalid user vanilla from 61.100.x.x
May 8 01:46:57 adhocbox sshd[28524]: Invalid user ice from 61.100.x.x
May 8 01:47:02 adhocbox sshd[28526]: Invalid user mason from 61.100.x.x
This is repeated for a few hundred lines, and is followed a few hours later by another batch of attacks from another IP. It gets tiresome for one's bandwidth to be taken up by these attempts at logins, especially when the server only has one valid user, as is the case for my setup.
I decided to put an end to this, by implementing a whitelist: allowing specific IPs through at the firewall, and blocking all others. Fortunately, I run an OpenWRT installation on my Internet router, which provides Linux's iptables infrastructure for the manipulation of firewall rules.
In this article, I'll detail how I set up my whitelist system, and how you can do the same.
My network consists of a Linksys WRT54G wireless router hosting the firewall, and a webserver running a distribution of Linux. For the purposes of this setup, the particulars of the webserver aren't an issue, but you will need:
iptables firewall, along with simplification scripts for the firewall rules. Any Linux box can act as the firewall, as long as you can pull the appropriate formatting together for the rules.
OpenWRT provides a simple wrapper over the Linux iptables interface, using awk to rewrite the contents of a configuration file into filtering and NAT rules, which are then applied by an init script. There's also a wrapper on top of that, which constitutes the Web interface to the firewall; it is this interface that most people associate with the OpenWRT firewall.
The major issue with the Web interface is that it's relatively clunky, especially when it comes to changing the order of firewall rules: new rules are added to the bottom of the list, and moving them to the top involves an arduous series of clicks and page loads. For most purposes, direct editing of the configuration file makes more sense.
A simple configuration may contain among its rules the following:
accept:dport=113 src=192.168.0.0/24
forward:proto=tcp dport=22:192.168.0.1:22
forward:proto=tcp dport=80:192.168.0.1:80
drop
This sample script will allow the firewall to accept Ident requests from inside the LAN, forward SSH and HTTP to a server at 192.168.0.1, and drop everything else. The parameters to each rule are parsed out by the init script, and built into iptables rules.
Just as with iptables, these rules are processed in order, and the first rule to match the incoming packet is applied. By using this principle, it's simple to put together a ruleset which will act as a whitelist for SSH:
forward:proto=tcp src=[IP #1] dport=22:192.168.0.1:22
forward:proto=tcp src=[IP #2] dport=22:192.168.0.1:22
drop:proto=tcp dport=22
In this example, any SSH packets coming from specific external IPs will be forwarded to the SSH server, and any other SSH packets will be dropped at the firewall. This is the behaviour which allows a whitelist: the next problem is how to add IPs to the list.
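The first-match behaviour can be modelled in a few lines of Python. This is a toy model of the ruleset above, not a description of how iptables or the OpenWRT init script is implemented; the two whitelisted addresses are taken from the ranges reserved for documentation, and stand in for [IP #1] and [IP #2].

```python
import ipaddress

# (action, source network or None for "any source", destination port)
RULES = [
    ("forward", ipaddress.ip_network("203.0.113.5/32"), 22),
    ("forward", ipaddress.ip_network("198.51.100.7/32"), 22),
    ("drop", None, 22),
]

def decide(src, dport, rules=RULES, default="continue"):
    """Return the action of the first rule that matches the packet."""
    addr = ipaddress.ip_address(src)
    for action, network, port in rules:
        if port == dport and (network is None or addr in network):
            return action
    return default   # nothing matched: fall through to later rules

print(decide("203.0.113.5", 22))   # whitelisted address
print(decide("61.100.0.1", 22))    # anyone else trying SSH
print(decide("61.100.0.1", 80))    # not SSH, so not decided here
```

Because the first match wins, the order matters: if the drop rule came first, the whitelisted addresses would never reach their forward rules.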
There are two ways to add addresses to this firewall ruleset. The first is to SSH into the OpenWRT router, edit the configuration file to add the appropriate rule, and then restart the firewall service:
ssh -l root 192.168.0.254
vim /etc/config/firewall
/etc/init.d/S45firewall restart
The problems with this method are two-fold:
Instead of using a manual process to update the list, it's possible to provide an externally-accessible interface to add IPs. In my case, I have a Web server (which happens to be my SSH server), so I can use a Web script to provide this interface; for the purposes of this article, PHP has been used as the language doing the work.
PHP doesn't have an interface to SSH version 2 by default: this is provided by a PECL extension named ssh2. Once this has been put in place, a variety of methods are exposed to allow for SSH connections to be made. These can be used to perform work on the OpenWRT router:
$ssh = ssh2_connect('192.168.0.254');
if(ssh2_auth_password($ssh, 'root', '[router root passwd]'))
{
    $stream = ssh2_shell($ssh);
    // The trailing newline is what makes the shell run the command
    fwrite($stream, "touch /tmp/newfile\n");
}
As an aside, if you don't like having the router's root password lying around in a PHP file, the PECL ssh2 extension also provides a public key authentication mechanism, and the SSH server on an OpenWRT installation allows addition of public keys in the same manner as OpenSSH.
Opening an interactive shell with ssh2_shell allows more than one command to be executed, which means we can do the file manipulation required to add an address to the list. We can combine everything, to produce the following script.
<?php
if(isset($_POST['add'])):
    // The value ends up inside a shell command run as root on the
    // router, so refuse anything that isn't a plain IP address
    $ip = filter_var($_POST['ip'], FILTER_VALIDATE_IP);
    if($ip === false)
        die('Invalid IP address.');

    $ssh = ssh2_connect('192.168.0.254');
    if(ssh2_auth_password($ssh, 'root', '[router root passwd]'))
    {
        $fp = ssh2_shell($ssh);
        fwrite($fp, 'echo "forward:proto=tcp src='.$ip.' dport=22:192.168.0.1:22" > /tmp/1'."\n");
        fwrite($fp, "cp /etc/config/firewall /tmp/2\n");
        fwrite($fp, "cat /tmp/1 /tmp/2 > /etc/config/firewall\n");
        fwrite($fp, "/bin/sh /etc/init.d/S45firewall\n");

        // Provide enough time for the firewall to restart
        sleep(10);
    }
    echo "Done.";
else:
?>
<form method="post">
 <input type="text" name="ip">
 <input type="submit" name="add" value="Add">
</form>
<?php endif; ?>
All that's required now is to navigate to this script, put an IP into the box, and wait 10 seconds. When this process has completed, the IP has automatically been added to the top of the firewall script, and the firewall restarted.
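Whatever is typed into the box ends up inside a shell command that runs as root on the router, so the value should be checked before it gets that far. The following is a minimal sketch of that check and of the "add to the top" step, written in Python rather than PHP so that it can be run without a router to hand; the function names and the sample address are made up for the example.

```python
import ipaddress

def whitelist_rule(ip, server="192.168.0.1", port=22):
    """Build a forward rule for `ip`; raises ValueError if it isn't an IP address."""
    addr = ipaddress.ip_address(ip.strip())
    return "forward:proto=tcp src={} dport={}:{}:{}".format(addr, port, server, port)

def prepend_rule(config_text, ip):
    """Return the configuration with the new rule as its first line."""
    return whitelist_rule(ip) + "\n" + config_text

config = "drop:proto=tcp dport=22\n"
print(prepend_rule(config, "203.0.113.5"))

try:
    whitelist_rule("1.2.3.4; rm -rf /")
except ValueError as err:
    print("rejected:", err)
```

The new rule has to go above the drop rule for the first-match processing to let the address through, which is why the script concatenates the new line in front of the existing file rather than appending it.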
That should be everything you need to set up your own whitelist for SSH. No more brute-force attacks against your server!
Copyright Imran Nazar <tf@oopsilon.com>, 2008
With this in mind, an ADO library was written for use by PHP developers many years ago, which would allow developers to directly port their existing code and interfaces to PHP. There are a few problems with this approach, as can be expected:
With the advent of PHP 5, a native database access layer was introduced to the core language: PHP Data Objects (PDO). Since this layer interfaces directly with the PHP core, it can operate on a much more efficient level, and therefore loads and runs much more quickly. Furthermore, since it's a current extension to PHP, it is maintained and kept secure.
In the ideal case, any applications using ADO under PHP would be redeveloped to use PDO. For large applications, however, this is infeasible: some kind of layer must be introduced over PDO, to "fake" the functionality of ADO on behalf of the application. The following is just such a layer.
define('ADODB_FETCH_NUM', PDO::FETCH_NUM);
define('ADODB_FETCH_ASSOC', PDO::FETCH_ASSOC);

/**
 * Connection and query wrapper
 */
class ADODB_PDO
{
    /** PDO connection to wrap */
    private $_db;

    /** Connection information (database name is public) */
    private $connector;
    private $dsn;
    private $host;
    private $user;
    private $pass;
    public $database;

    /** Debug flag, publicly accessible */
    public $debug;

    /** PDO demands fetchmodes on each resultset, so define a default */
    private $fetchmode;

    /** Number of rows affected by the last Execute */
    private $affected_rows;

    /**
     * Constructor: Initialise connector
     * @param connector String denoting type of database
     */
    public function __construct($connector='mysql')
    {
        $this->connector = $connector;
    }

    /**
     * Connect: Establish connection to a database
     * @param host String
     * @param user String [optional]
     * @param pass String [optional]
     * @param database String [optional]
     */
    public function Connect($host, $user='', $pass='', $database='')
    {
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->database = $database;

        switch($this->connector)
        {
            case 'mysql':
                $this->dsn = sprintf('%s:host=%s;dbname=%s',
                    $this->connector, $this->host, $this->database);
                $this->_db = new PDO($this->dsn, $this->user, $this->pass);
                $this->_db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
                $this->fetchmode = ADODB_FETCH_ASSOC;
                break;
        }
    }

    /**
     * SetFetchMode: Change the fetch mode of future resultsets
     * @param fm Integer specified by constant
     */
    public function SetFetchMode($fm)
    {
        $this->fetchmode = $fm;
    }

    /**
     * Insert_ID: Retrieve the ID of the last insert operation
     * @return String containing last insert ID
     */
    public function Insert_ID()
    {
        return $this->_db->lastInsertId();
    }

    /**
     * GetAll: Retrieve an array of results from a query
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return Array of results
     */
    public function GetAll($sql, $vars=null)
    {
        $st = $this->DoQuery($sql, $vars);
        return $st?$st->fetchAll():false;
    }

    /**
     * CacheGetAll: Wrapper to emulate cached GetAll
     * @param timeout int count of seconds for cache expiry
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return Array of results
     */
    public function CacheGetAll($timeout, $sql, $vars=null)
    {
        return $this->GetAll($sql, $vars);
    }

    /**
     * Execute: Retrieve a resultset from a query
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return ADODB_PDO_ResultSet object
     */
    public function Execute($sql, $vars=null)
    {
        $st = $this->DoQuery($sql, $vars);

        // Check for failure before asking the statement for its row count
        if(!$st) return false;

        $this->affected_rows = $st->rowCount();
        return new ADODB_PDO_ResultSet($st);
    }

    /**
     * CacheExecute: Wrapper to emulate cached Execute
     * @param timeout int count of seconds for cache expiry
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return ADODB_PDO_ResultSet object
     */
    public function CacheExecute($timeout, $sql, $vars=null)
    {
        return $this->Execute($sql, $vars);
    }

    /**
     * Affected_Rows: Retrieve the number of rows affected by Execute
     * @return The number of affected rows
     */
    public function Affected_Rows()
    {
        return $this->affected_rows;
    }

    /**
     * GetRow: Retrieve the first row of a query result
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return Array of data from first result
     */
    public function GetRow($sql, $vars=null)
    {
        $st = $this->DoQuery($sql, $vars);
        return $st?$st->fetch():false;
    }

    /**
     * GetOne: Retrieve the first value in the first row of a query
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return String data of the requested value
     */
    public function GetOne($sql, $vars=null)
    {
        $st = $this->DoQuery($sql, $vars);
        return $st?$st->fetchColumn():false;
    }

    /**
     * GetAssoc: Retrieve data from a query mapped by value of first column
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return Array of mapped data
     */
    public function GetAssoc($sql, $vars=null)
    {
        $out = array();
        $st = $this->DoQuery($sql, $vars);
        if($st)
        {
            if($st->columnCount() > 2)
            {
                while($row = $st->fetch())
                {
                    $rowidx = array_shift($row);
                    $out[$rowidx] = $row;
                }
            }
            else if($st->columnCount() == 2)
            {
                while($row = $st->fetch())
                {
                    $rowidx = array_shift($row);
                    $out[$rowidx] = array_shift($row);
                }
            }
            else $out = false;
        }
        else $out = false;

        return $out;
    }

    /**
     * GetCol: Retrieve the values of the first column of a query
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return Array of column data
     */
    public function GetCol($sql, $vars=null)
    {
        $out = array();
        $st = $this->DoQuery($sql, $vars);
        if($st)
        {
            // Compare against false, so that values like 0 or '' don't end the loop
            while(($val = $st->fetchColumn()) !== false)
                $out[] = $val;
            return $out;
        }
        else return false;
    }

    /**
     * MetaColumns: Retrieve information about a table's columns
     * @param table String name of table to find out about
     * @return Array of ADODB_PDO_FieldData objects
     */
    public function MetaColumns($table)
    {
        $out = array();
        $st = $this->DoQuery('select * from '.$table);
        for($i=0; $i<$st->columnCount(); $i++)
            $out[] = new ADODB_PDO_FieldData($st->getColumnMeta($i));
        return $out;
    }

    /**
     * qstr: Quote a string for use in database queries
     * @param in String parameter to quote
     * @return String quoted by database
     */
    public function qstr($in)
    {
        return $this->_db->quote($in);
    }

    /**
     * quote: Quote a string for use in database queries
     * @param in String parameter to quote
     * @return String quoted by database
     */
    public function quote($in)
    {
        return $this->_db->quote($in);
    }

    /**
     * DoQuery: Private helper function for Get*
     * @param sql String query to execute
     * @param vars Array of variables to bind [optional]
     * @return PDOStatement object of results, or false on fail
     */
    private function DoQuery($sql, $vars=null)
    {
        $st = $this->_db->prepare($sql);
        $st->setFetchMode($this->fetchmode);

        // No variables means nothing to bind; a single value is wrapped in an array
        if($vars === null) $vars = array();
        else if(!is_array($vars)) $vars = array($vars);

        return $st->execute($vars)?$st:false;
    }
}

/**
 * Resultset wrapper
 */
class ADODB_PDO_ResultSet
{
    /** PDO resultset to wrap */
    private $_st;

    /** One-time resultset information */
    private $results;
    private $rowcount;
    private $cursor;

    /** Publicly accessible row values */
    public $fields;

    /** Public end-of-resultset flag */
    public $EOF;

    /**
     * Constructor: Initialise resultset and first results
     * @param st PDOStatement object to wrap
     */
    public function __construct($st)
    {
        $this->_st = $st;
        $this->results = $st->fetchAll();
        $this->rowcount = count($this->results);
        $this->cursor = 0;
        $this->MoveNext();
    }

    /**
     * RecordCount: Retrieve number of records in this RS
     * @return Integer number of records
     */
    public function RecordCount()
    {
        return $this->rowcount;
    }

    /**
     * MoveNext: Fetch next row and check if we're at the end
     */
    public function MoveNext()
    {
        if($this->cursor < $this->rowcount)
        {
            $this->fields = $this->results[$this->cursor++];
            $this->EOF = 0;
        }
        else
        {
            // Moved past the last row (or the resultset was empty)
            $this->fields = false;
            $this->EOF = 1;
        }
    }
}

/**
 * Table field information wrapper
 */
class ADODB_PDO_FieldData
{
    public $name;
    public $max_length;
    public $type;

    /**
     * Constructor: Map PDO meta information to object field data
     * @param meta Array from PDOStatement::getColumnMeta
     */
    public function __construct($meta)
    {
        $lut = array(
            'LONG' => 'int',
            'VAR_STRING' => 'varchar'
        );

        $this->name = $meta['name'];
        $this->max_length = $meta['len'];
        $this->type = $lut[$meta['native_type']];
    }
}

/**
 * NewADOConnection: Thin wrapper to generate a new ADODB_PDO object
 * @param connector String denoting type of database
 * @return ADODB_PDO object
 */
function NewADOConnection($connector)
{
    return new ADODB_PDO($connector);
}
The ideal case, of course, is for there to be no moving parts: things that move or spin inevitably have friction, and that causes noise. I've tried to eliminate everything that spins from my setup: I have a CPU that doesn't need a fan on the heatsink, and a fanless power supply. However, there's one thing left that is spinning, and that's the hard disk which hosts the Linux installation. Dropping that would make my system truly silent, so I started looking into how that could be done.
The only viable choice for a non-spinning medium to host the operating system is a USB Flash drive: most motherboard BIOSes have the ability to boot from a Flash device, and there's no other medium which is easily available in the sizes required. So the choice was obvious: copy the operating system from hard disk to Flash, and boot it from there.
Unfortunately, it's not quite that easy. In order for the BIOS to understand the Flash disk and boot from it, the disk must be formatted in a very simple format: specifically, good old FAT32. It's quite unacceptable for a Linux root filesystem to be based on FAT32, so a certain process has to be run through:
Each step will require its own tools to get the job done, and I'll be covering the details behind each of these tools when they're used. In the meantime, here's a short list of everything that will be employed:
mount, cd and pivot_root, among other things. BusyBox provides all these tools in one go, as I'll explain in more detail later.
The first step is to build the disk image that will eventually act as the root filesystem. This will most likely come from a system that's already running: in my case, I'll be using the media server, which runs on Gentoo. We can't compress / as-is, since other filesystems are mounted into it, so we have to remove those mounts. A simple way of doing this is to remount / somewhere else, by binding it with mount:
/ to remove mount points
# mkdir /root/bindmount
# mount -o bind / /root/bindmount
The new bindmounted root is a plain representation of what's on the disk, with no additional mounts: that means there should be no files in /proc or /sys, and the bare minimum in /dev. If there are any files in these three directories, you can safely get rid of them:
# cd /root/bindmount
# rm -rf proc/* dev/* sys/*
# cp -a /dev/console /dev/null /dev/initctl dev
The last line above copies over the three devices which are initially needed by the boot process: the character devices for the console and null, and the FIFO used by init. Every other device is filled in by the kernel when udev automatically populates /dev.
We're still operating on the root filesystem itself; it's now safe to make a copy, to which we can make any further changes:
# mkdir /root/fscopy
# cp -av /root/bindmount/* /root/fscopy
Once the copy is complete, any files in the copy's tmp and var/tmp directories can safely be cleaned out, since they're temporaries and won't be needed.
In an ideal world, you'd be able to take the filesystem copy and dump it to Flash, from which it would run. Unfortunately, there are two reasons as to why it's not that simple:
Fortunately, people more clever than I have devised a way of doing this: keep a compressed image on the Flash as a read-only starting point for the filesystem, and hold the changes in a RAM drive. I'll be setting up the RAM drive later, but the disk image can be done right now, using SquashFS.
There are two redeeming factors to SquashFS. Firstly, it has a hugely high compression ratio; the 2GB filesystem on my media server compresses down to about 500MB, which makes it ideal for the smaller USB Flash drives. Secondly, it's read in blocks (64k blocks by default), and the kernel will only cache a few blocks at a time; as a result, SquashFS doesn't take up much memory at all when it's running.
Setting up the image is a simple matter of calling the compressor:
# cd /root/fscopy
# mksquashfs * ../filesystem.squash
Now, if you're lucky, you'll already have kernel support for SquashFS, and for UnionFS which we'll be using later. Those of you running Ubuntu Feisty, for example, will already have the modules, and don't need to do anything further. You can find out if you're one of the lucky people quite easily:
# modprobe squashfs
# modprobe unionfs
# mount -o loop /root/filesystem.squash /mnt
If these commands all run fine, and you can see your squashed filesystem in /mnt, then you don't need to do any more work on the kernel. If, however, you're a masochist like myself, you'll have to compile your own kernel containing SquashFS and UnionFS. Unfortunately, these eminently useful filesystems aren't in the default kernel package, so you'll have to patch the kernel tree:
# cd /usr/src
# wget http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.22.9.tar.bz2
# tar xjf linux-2.6.22.9.tar.bz2
# wget http://switch.dl.sourceforge.net/sourceforge/squashfs/squashfs3.2-r2.tar.gz
# tar xzf squashfs3.2-r2.tar.gz
# cd linux-2.6.22.9
# patch -p1 < ../squashfs3.2-r2/kernel-patches/linux-2.6.20/squashfs3.2-patch
# wget -O- http://download.filesystems.org/unionfs/unionfs-2.1/unionfs-2.1.6_for_2.6.22.9.diff.gz | gunzip | patch -p1
Of course, these packages are current as of October '07, and will change with time. At the moment, the above lines will complete successfully.
Once the patches have been applied, you can build your kernel as normal, with all your usual drivers and modules, and with the two newly patched-in filesystems enabled:
File systems ->
  Layered filesystems ->
    <M> Union file system
  Miscellaneous filesystems ->
    <M> SquashFS 3.2 - Squashed file system support
Whether you've built your own kernel or are using the one shipped with your distribution, the next step in the process is to get it booting. Since the USB Flash disk has a filesystem, it should be a simple matter of copying the kernel file over and applying some magic; that magic comes in the form of SysLinux, a small package which boots a kernel from a variety of situations. In order to use SysLinux, it first has to be installed; on Gentoo Linux, the installation would run as follows:
# emerge syslinux
SysLinux provides a boot system from which the Linux kernel can be started; it does this by adding a bootsector and system file to the FAT partition on which it's installed. Performing this installation couldn't be simpler:
# syslinux /dev/sda1
When SysLinux starts booting, it will look to a configuration file called syslinux.cfg, in order to find out what to do. You can make the configuration file complex, with multiple kernels and various options between which you can select at boot-time; in this case, we need the simplest possible configuration:
syslinux.cfg: Booting the kernel
default kernel.img
This configuration will force SysLinux to look for a Linux kernel file, called kernel.img, and load it. Now all that's required is to copy the kernel over: depending on which route you took above, the kernel will either be inside the kernel source tree, or sitting in /boot. I compiled my kernel, so I ran the following:
# mount /dev/sda1 /mnt/flash
# cp /usr/src/linux-2.6.22.9/arch/i386/boot/bzImage /mnt/flash/kernel.img
Now you'll be able to boot Linux from the Flash drive. Unfortunately, you won't get very far before you hit a kernel panic; no root device was specified, so the boot falls over. To allow the rest of the operating system to boot, a temporary root filesystem is required: that's what I'll build next.
Now, I could just use the SquashFS image as the root disk, and let it be. There are problems with that, though: namely that the SquashFS image is read-only, and we'll need a system that can accommodate changes to stuff like the log files. Fortunately, there is a way to allow this, which is to overlay a temporary file system on the SquashFS image, and write changes to that.
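The overlay idea is easier to picture with a toy model. In the Python sketch below, two dictionaries stand in for the two filesystems, and ChainMap from the standard library plays the part of the union: lookups fall through to the read-only layer, while writes only ever touch the layer on top. The paths and contents are invented for the example, and this says nothing about how UnionFS is implemented inside the kernel.

```python
from collections import ChainMap

# The compressed image: never modified once it has been built
squashfs = {"/etc/hostname": "mediabox", "/var/log/messages": ""}

# The temporary filesystem in RAM: starts out empty
tmpfs = {}

# The union: tmpfs is searched first, then the SquashFS image
root = ChainMap(tmpfs, squashfs)

root["/var/log/messages"] = "booted\n"   # a write lands in tmpfs

print(root["/etc/hostname"])             # falls through to the image
print(root["/var/log/messages"])         # the tmpfs copy hides the original
print(squashfs["/var/log/messages"])     # the image itself is untouched
```

Since the top layer lives in RAM, any changes disappear at power-off and the system comes back up exactly as the image describes it.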
In order to set this up, we need an intermediate step between the kernel and the filesystem image, which can perform the overlay and then "pivot" over to the actual root image. The kernel has a feature which allows it to do this: the initrd, or initial root disk, which can be loaded into RAM. The initrd can contain anything you like, as long as it's relatively small: as the whole disk image is loaded into RAM, it has to leave room for the root filesystem proper.
Building the initial root disk is a less complicated affair than putting together the SquashFS image, and it starts by allocating some space for the image. I've used 8MB, since it's ample room for everything we'll need:
# dd if=/dev/zero of=/root/initrd bs=1M count=8
# mke2fs /root/initrd
# mount -o loop /root/initrd /mnt/initrd
Once the initrd has been filled in with a filesystem, and mounted somewhere, we can add stuff to it. The first things that are required are a basic directory structure, and a few device nodes:
# cd /mnt/initrd
# mkdir -p bin dev etc lib/modules mnt proc sbin tmp usr/bin usr/sbin var/lib
# for a in {0,1,2,3,4,5,6,7}; do mkdir mnt/$a; done
# for a in {tty*,console,null}; do cp -a `find /dev -maxdepth 1 -name "$a" -type c` dev; done
# for a in {hd*,sd*,fd*}; do cp -a `find /dev -maxdepth 1 -name "$a" -type b` dev; done
Once the initrd has a basic structure, it needs some executable utilities in order to do anything: commands like sh, mount and cp all have to be provided. Fortunately, there's a nifty package called BusyBox which compiles all the utilities one could possibly need into one binary file. The configuration process for BusyBox is something I won't be covering in great detail, but it's very similar to the Linux kernel configuration: a system of menus is provided, from which selections can be made. Be aware that you'll have to compile BusyBox as a statically-linked binary, otherwise the initrd will require not just the executable, but the libraries on which it depends.
# wget http://busybox.net/downloads/busybox-1.7.2.tar.bz2
# tar xjf busybox-1.7.2.tar.bz2
# cd busybox-1.7.2
# make menuconfig
# make
I've provided a copy of the initrd below, for those who are less willing to run through the above process. It contains the basic directory structure, along with the device nodes and a statically compiled copy of BusyBox:
http://oopsilon.com/software/linux-initrd.gz [1.1MB]
What's missing from the initrd is a copy of the modules associated with the kernel. You can retrieve these from /lib/modules and simply copy them over to the initrd; in the example below, I'm copying the modules from the 2.6.22.9 kernel I compiled earlier, which are installed in a directory named after the kernel version:
# cp -a /lib/modules/2.6.22.9 /mnt/initrd/lib/modules
The final step in the initrd is to give it a purpose: at present, it's a collection of binaries and device nodes with nothing to do. We have an initial root disk and a kernel, but no way to link the two: the kernel should load and boot the initrd. This is done by giving parameters to the kernel when it boots, and that's done from the SysLinux configuration:
syslinux.cfg: Including the initrd
default kernel.img initrd=initrd.gz root=/dev/ram0 ramdisk_size=8192 rw init=/linuxrc
As can be seen in this configuration, the kernel is told to run a file called linuxrc on the initrd, as the initialisation script. I've provided a simple linuxrc on the initrd image linked above; all it does is load a shell:
linuxrc: Sample init file
#!/bin/msh
mount -t proc proc /proc
clear
exec /bin/msh
You may have guessed that this file can contain just about anything, as long as it uses commands supplied by BusyBox. We can mount the SquashFS root filesystem, overlay a temporary RAM disk on top and start up the new root, with the following script:
linuxrc: Pivoting to the real root
#!/bin/msh
echo Initial root disk loaded. Proceeding.

# Mount the proc filesystem, and the Flash disk
mount -t proc proc /proc
mount /dev/sda1 /mnt/0

# Find the SquashFS image on the Flash disk, and mount it
mount /mnt/0/newroot.sfs /mnt/1

# Mount a temporary filesystem, to use as the overlay
mount -t tmpfs -o size=100M tmpfs /mnt/2

# Perform the overlay with UnionFS, with tmpfs as read/write
# and the SquashFS as read-only
mount -t unionfs -o dirs=/mnt/2=rw:/mnt/1=ro /mnt/1 /mnt/3

# Pivot to the new root
cd /mnt/3
mkdir initrd
pivot_root . initrd

# Enter the new root, and run init
exec chroot . /sbin/init </dev/console >/dev/console 2>&1
You may have to change the device reference for the Flash disk, depending on the kernel you use: if you boot the initrd with the simple shell-exec script I provided above, and check the output of dmesg, you should be able to see where the kernel has loaded from.
With the new linuxrc, the initrd is complete, and can now be unmounted and compressed:
# umount /mnt/initrd
# gzip -9 /root/initrd
# cp /root/initrd.gz /mnt/flash
# umount /mnt/flash
And that should be that. Throw the Flash disk into a spare computer, and watch it boot: it should look just like your hard disk's boot process. If it doesn't, the cause may be one of a few things:
root or no init
linuxrc for the Flash disk; if the script can't find the Flash disk, things will fall over.
init fails to read /dev/initctl
init script, as used by most Linux distributions, uses a FIFO called /dev/initctl to communicate and change runlevels; if this node doesn't exist on the SquashFS root, init will fail.
If you have any more obscure errors, feel free to get in touch with either myself or your local Linux support channel; also, please let me know if you manage to get this setup working. The procedure above was quite smooth for me, but in the eternal clause of the technical tutorial, your mileage may vary.
Copyright Imran Nazar <tf@oopsilon.com>, 2007
Update, Nov 2023: Cliff Biffle has put together a Thumb-2 opcode map for M-profile ARM cores (Google Sheet), which may be more relevant to modern interests than the ARMv4/5 opcode map; my map remains below for posterity.
The following is a full opcode map of instructions for the ARM7 and ARM9 series of CPU cores. Instructions added for ARM9 are highlighted in blue, and instructions specific to the M-extension are shown in green. The Thumb instruction set is also included, in Table 2.
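To look an instruction up in the ARM table, the row and column can be pulled straight out of the 32-bit instruction word. The helper below is a small Python sketch of that lookup; the function name is my own, and the two sample words are hand-assembled encodings of MUL r0, r1, r2 and ADDS r0, r1, r2 with the "always" condition.

```python
def opcode_map_index(instruction):
    """Return (row, column) in the ARM opcode map for a 32-bit instruction word."""
    row = (instruction >> 20) & 0xFF     # bits 27-20
    column = (instruction >> 4) & 0xF    # bits 7-4
    return row, column

for word in (0xE0000291, 0xE0910002):
    row, column = opcode_map_index(word)
    print("{:08X}: row {:02X}, column {:X}".format(word, row, column))

# E0000291: row 00, column 9   (MUL)
# E0910002: row 09, column 0   (ADDS lli)
```

The condition code in bits 31-28 plays no part in the lookup, which is why it doesn't appear anywhere in the map.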
Bits 27-20 \ Bits 7-4 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
00 | AND lli | AND llr | AND lri | AND lrr | AND ari | AND arr | AND rri | AND rrr | AND lli | MUL | AND lri | STRH ptrm | AND ari | LDRD ptrm | AND rri | STRD ptrm |
01 | ANDS lli | ANDS llr | ANDS lri | ANDS lrr | ANDS ari | ANDS arr | ANDS rri | ANDS rrr | ANDS lli | MULS | ANDS lri | LDRH ptrm | ANDS ari | LDRSB ptrm | ANDS rri | LDRSH ptrm |
02 | EOR lli | EOR llr | EOR lri | EOR lrr | EOR ari | EOR arr | EOR rri | EOR rrr | EOR lli | MLA | EOR lri | STRH ptrm | EOR ari | LDRD ptrm | EOR rri | STRD ptrm |
03 | EORS lli | EORS llr | EORS lri | EORS lrr | EORS ari | EORS arr | EORS rri | EORS rrr | EORS lli | MLAS | EORS lri | LDRH ptrm | EORS ari | LDRSB ptrm | EORS rri | LDRSH ptrm |
04 | SUB lli | SUB llr | SUB lri | SUB lrr | SUB ari | SUB arr | SUB rri | SUB rrr | SUB lli | | SUB lri | STRH ptim | SUB ari | LDRD ptim | SUB rri | STRD ptim |
05 | SUBS lli | SUBS llr | SUBS lri | SUBS lrr | SUBS ari | SUBS arr | SUBS rri | SUBS rrr | SUBS lli | | SUBS lri | LDRH ptim | SUBS ari | LDRSB ptim | SUBS rri | LDRSH ptim |
06 | RSB lli | RSB llr | RSB lri | RSB lrr | RSB ari | RSB arr | RSB rri | RSB rrr | RSB lli | | RSB lri | STRH ptim | RSB ari | LDRD ptim | RSB rri | STRD ptim |
07 | RSBS lli | RSBS llr | RSBS lri | RSBS lrr | RSBS ari | RSBS arr | RSBS rri | RSBS rrr | RSBS lli | | RSBS lri | LDRH ptim | RSBS ari | LDRSB ptim | RSBS rri | LDRSH ptim |
08 | ADD lli | ADD llr | ADD lri | ADD lrr | ADD ari | ADD arr | ADD rri | ADD rrr | ADD lli | UMULL | ADD lri | STRH ptrp | ADD ari | LDRD ptrp | ADD rri | STRD ptrp |
09 | ADDS lli | ADDS llr | ADDS lri | ADDS lrr | ADDS ari | ADDS arr | ADDS rri | ADDS rrr | ADDS lli | UMULLS | ADDS lri | LDRH ptrp | ADDS ari | LDRSB ptrp | ADDS rri | LDRSH ptrp |
0A | ADC lli | ADC llr | ADC lri | ADC lrr | ADC ari | ADC arr | ADC rri | ADC rrr | ADC lli | UMLAL | ADC lri | STRH ptrp | ADC ari | LDRD ptrp | ADC rri | STRD ptrp |
0B | ADCS lli | ADCS llr | ADCS lri | ADCS lrr | ADCS ari | ADCS arr | ADCS rri | ADCS rrr | ADCS lli | UMLALS | ADCS lri | LDRH ptrp | ADCS ari | LDRSB ptrp | ADCS rri | LDRSH ptrp |
0C | SBC lli | SBC llr | SBC lri | SBC lrr | SBC ari | SBC arr | SBC rri | SBC rrr | SBC lli | SMULL | SBC lri | STRH ptip | SBC ari | LDRD ptip | SBC rri | STRD ptip |
0D | SBCS lli | SBCS llr | SBCS lri | SBCS lrr | SBCS ari | SBCS arr | SBCS rri | SBCS rrr | SBCS lli | SMULLS | SBCS lri | LDRH ptip | SBCS ari | LDRSB ptip | SBCS rri | LDRSH ptip |
0E | RSC lli | RSC llr | RSC lri | RSC lrr | RSC ari | RSC arr | RSC rri | RSC rrr | RSC lli | SMLAL | RSC lri | STRH ptip | RSC ari | LDRD ptip | RSC rri | STRD ptip |
0F | RSCS lli | RSCS llr | RSCS lri | RSCS lrr | RSCS ari | RSCS arr | RSCS rri | RSCS rrr | RSCS lli | SMLALS | RSCS lri | LDRH ptip | RSCS ari | LDRSB ptip | RSCS rri | LDRSH ptip |
10 | MRS rc | | | | | QADD | | | SMLABB | SWP | SMLATB | STRH ofrm | SMLABT | LDRD ofrm | SMLATT | STRD ofrm |
11 | TSTS lli | TSTS llr | TSTS lri | TSTS lrr | TSTS ari | TSTS arr | TSTS rri | TSTS rrr | TSTS lli | | TSTS lri | LDRH ofrm | TSTS ari | LDRSB ofrm | TSTS rri | LDRSH ofrm |
12 | MSR rc | BX | | BLX reg | | QSUB | | BKPT | SMLAWB | | SMULWB | STRH prrm | SMLAWT | LDRD prrm | SMULWT | STRD prrm |
13 | TEQS lli | TEQS llr | TEQS lri | TEQS lrr | TEQS ari | TEQS arr | TEQS rri | TEQS rrr | TEQS lli | | TEQS lri | LDRH prrm | TEQS ari | LDRSB prrm | TEQS rri | LDRSH prrm |
14 | MRS rs | | | | | QDADD | | | SMLALBB | SWPB | SMLALTB | STRH ofim | SMLALBT | LDRD ofim | SMLALTT | STRD ofim |
15 | CMPS lli | CMPS llr | CMPS lri | CMPS lrr | CMPS ari | CMPS arr | CMPS rri | CMPS rrr | CMPS lli | | CMPS lri | LDRH ofim | CMPS ari | LDRSB ofim | CMPS rri | LDRSH ofim |
16 | MSR rs | CLZ | | | | QDSUB | | | SMULBB | | SMULTB | STRH prim | SMULBT | LDRD prim | SMULTT | STRD prim |
17 | CMNS lli | CMNS llr | CMNS lri | CMNS lrr | CMNS ari | CMNS arr | CMNS rri | CMNS rrr | CMNS lli | | CMNS lri | LDRH prim | CMNS ari | LDRSB prim | CMNS rri | LDRSH prim |
18 | ORR lli | ORR llr | ORR lri | ORR lrr | ORR ari | ORR arr | ORR rri | ORR rrr | ORR lli | | ORR lri | STRH ofrp | ORR ari | LDRD ofrp | ORR rri | STRD ofrp |
19 | ORRS lli | ORRS llr | ORRS lri | ORRS lrr | ORRS ari | ORRS arr | ORRS rri | ORRS rrr | ORRS lli | | ORRS lri | LDRH ofrp | ORRS ari | LDRSB ofrp | ORRS rri | LDRSH ofrp |
1A | MOV lli | MOV llr | MOV lri | MOV lrr | MOV ari | MOV arr | MOV rri | MOV rrr | MOV lli | | MOV lri | STRH prrp | MOV ari | LDRD prrp | MOV rri | STRD prrp |
1B | MOVS lli | MOVS llr | MOVS lri | MOVS lrr | MOVS ari | MOVS arr | MOVS rri | MOVS rrr | MOVS lli | | MOVS lri | LDRH prrp | MOVS ari | LDRSB prrp | MOVS rri | LDRSH prrp |
1C | BIC lli | BIC llr | BIC lri | BIC lrr | BIC ari | BIC arr | BIC rri | BIC rrr | BIC lli | | BIC lri | STRH ofip | BIC ari | LDRD ofip | BIC rri | STRD ofip |
1D | BICS lli | BICS llr | BICS lri | BICS lrr | BICS ari | BICS arr | BICS rri | BICS rrr | BICS lli | | BICS lri | LDRH ofip | BICS ari | LDRSB ofip | BICS rri | LDRSH ofip |
1E | MVN lli | MVN llr | MVN lri | MVN lrr | MVN ari | MVN arr | MVN rri | MVN rrr | MVN lli | | MVN lri | STRH prip | MVN ari | LDRD prip | MVN rri | STRD prip |
1F | MVNS lli | MVNS llr | MVNS lri | MVNS lrr | MVNS ari | MVNS arr | MVNS rri | MVNS rrr | MVNS lli | | MVNS lri | LDRH prip | MVNS ari | LDRSB prip | MVNS rri | LDRSH prip |
20 | AND imm | |||||||||||||||
21 | ANDS imm | |||||||||||||||
22 | EOR imm | |||||||||||||||
23 | EORS imm | |||||||||||||||
24 | SUB imm | |||||||||||||||
25 | SUBS imm | |||||||||||||||
26 | RSB imm | |||||||||||||||
27 | RSBS imm | |||||||||||||||
28 | ADD imm | |||||||||||||||
29 | ADDS imm | |||||||||||||||
2A | ADC imm | |||||||||||||||
2B | ADCS imm | |||||||||||||||
2C | SBC imm | |||||||||||||||
2D | SBCS imm | |||||||||||||||
2E | RSC imm | |||||||||||||||
2F | RSCS imm | |||||||||||||||
30 | ||||||||||||||||
31 | TSTS imm | |||||||||||||||
32 | MSR ic | |||||||||||||||
33 | TEQS imm | |||||||||||||||
34 | ||||||||||||||||
35 | CMPS imm | |||||||||||||||
36 | MSR is | |||||||||||||||
37 | CMNS imm | |||||||||||||||
38 | ORR imm | |||||||||||||||
39 | ORRS imm | |||||||||||||||
3A | MOV imm | |||||||||||||||
3B | MOVS imm | |||||||||||||||
3C | BIC imm | |||||||||||||||
3D | BICS imm | |||||||||||||||
3E | MVN imm | |||||||||||||||
3F | MVNS imm | |||||||||||||||
40 | STR ptim | |||||||||||||||
41 | LDR ptim | |||||||||||||||
42 | STRT ptim | |||||||||||||||
43 | LDRT ptim | |||||||||||||||
44 | STRB ptim | |||||||||||||||
45 | LDRB ptim | |||||||||||||||
46 | STRBT ptim | |||||||||||||||
47 | LDRBT ptim | |||||||||||||||
48 | STR ptip | |||||||||||||||
49 | LDR ptip | |||||||||||||||
4A | STRT ptip | |||||||||||||||
4B | LDRT ptip | |||||||||||||||
4C | STRB ptip | |||||||||||||||
4D | LDRB ptip | |||||||||||||||
4E | STRBT ptip | |||||||||||||||
4F | LDRBT ptip | |||||||||||||||
50 | STR ofim | |||||||||||||||
51 | LDR ofim | |||||||||||||||
52 | STR prim | |||||||||||||||
53 | LDR prim | |||||||||||||||
54 | STRB ofim | |||||||||||||||
55 | LDRB ofim | |||||||||||||||
56 | STRB prim | |||||||||||||||
57 | LDRB prim | |||||||||||||||
58 | STR ofip | |||||||||||||||
59 | LDR ofip | |||||||||||||||
5A | STR prip | |||||||||||||||
5B | LDR prip | |||||||||||||||
5C | STRB ofip | |||||||||||||||
5D | LDRB ofip | |||||||||||||||
5E | STRB prip | |||||||||||||||
5F | LDRB prip | |||||||||||||||
60 | STR ptrmll | STR ptrmlr | STR ptrmar | STR ptrmrr | STR ptrmll | STR ptrmlr | STR ptrmar | STR ptrmrr | ||||||||
61 | LDR ptrmll | LDR ptrmlr | LDR ptrmar | LDR ptrmrr | LDR ptrmll | LDR ptrmlr | LDR ptrmar | LDR ptrmrr | ||||||||
62 | STRT ptrmll | STRT ptrmlr | STRT ptrmar | STRT ptrmrr | STRT ptrmll | STRT ptrmlr | STRT ptrmar | STRT ptrmrr | ||||||||
63 | LDRT ptrmll | LDRT ptrmlr | LDRT ptrmar | LDRT ptrmrr | LDRT ptrmll | LDRT ptrmlr | LDRT ptrmar | LDRT ptrmrr | ||||||||
64 | STRB ptrmll | STRB ptrmlr | STRB ptrmar | STRB ptrmrr | STRB ptrmll | STRB ptrmlr | STRB ptrmar | STRB ptrmrr | ||||||||
65 | LDRB ptrmll | LDRB ptrmlr | LDRB ptrmar | LDRB ptrmrr | LDRB ptrmll | LDRB ptrmlr | LDRB ptrmar | LDRB ptrmrr | ||||||||
66 | STRBT ptrmll | STRBT ptrmlr | STRBT ptrmar | STRBT ptrmrr | STRBT ptrmll | STRBT ptrmlr | STRBT ptrmar | STRBT ptrmrr | ||||||||
67 | LDRBT ptrmll | LDRBT ptrmlr | LDRBT ptrmar | LDRBT ptrmrr | LDRBT ptrmll | LDRBT ptrmlr | LDRBT ptrmar | LDRBT ptrmrr | ||||||||
68 | STR ptrpll | STR ptrplr | STR ptrpar | STR ptrprr | STR ptrpll | STR ptrplr | STR ptrpar | STR ptrprr | ||||||||
69 | LDR ptrpll | LDR ptrplr | LDR ptrpar | LDR ptrprr | LDR ptrpll | LDR ptrplr | LDR ptrpar | LDR ptrprr | ||||||||
6A | STRT ptrpll | STRT ptrplr | STRT ptrpar | STRT ptrprr | STRT ptrpll | STRT ptrplr | STRT ptrpar | STRT ptrprr | ||||||||
6B | LDRT ptrpll | LDRT ptrplr | LDRT ptrpar | LDRT ptrprr | LDRT ptrpll | LDRT ptrplr | LDRT ptrpar | LDRT ptrprr | ||||||||
6C | STRB ptrpll | STRB ptrplr | STRB ptrpar | STRB ptrprr | STRB ptrpll | STRB ptrplr | STRB ptrpar | STRB ptrprr | ||||||||
6D | LDRB ptrpll | LDRB ptrplr | LDRB ptrpar | LDRB ptrprr | LDRB ptrpll | LDRB ptrplr | LDRB ptrpar | LDRB ptrprr | ||||||||
6E | STRBT ptrpll | STRBT ptrplr | STRBT ptrpar | STRBT ptrprr | STRBT ptrpll | STRBT ptrplr | STRBT ptrpar | STRBT ptrprr | ||||||||
6F | LDRBT ptrpll | LDRBT ptrplr | LDRBT ptrpar | LDRBT ptrprr | LDRBT ptrpll | LDRBT ptrplr | LDRBT ptrpar | LDRBT ptrprr | ||||||||
70 | STR ofrmll | STR ofrmlr | STR ofrmar | STR ofrmrr | STR ofrmll | STR ofrmlr | STR ofrmar | STR ofrmrr | ||||||||
71 | LDR ofrmll | LDR ofrmlr | LDR ofrmar | LDR ofrmrr | LDR ofrmll | LDR ofrmlr | LDR ofrmar | LDR ofrmrr | ||||||||
72 | STR prrmll | STR prrmlr | STR prrmar | STR prrmrr | STR prrmll | STR prrmlr | STR prrmar | STR prrmrr | ||||||||
73 | LDR prrmll | LDR prrmlr | LDR prrmar | LDR prrmrr | LDR prrmll | LDR prrmlr | LDR prrmar | LDR prrmrr | ||||||||
74 | STRB ofrmll | STRB ofrmlr | STRB ofrmar | STRB ofrmrr | STRB ofrmll | STRB ofrmlr | STRB ofrmar | STRB ofrmrr | ||||||||
75 | LDRB ofrmll | LDRB ofrmlr | LDRB ofrmar | LDRB ofrmrr | LDRB ofrmll | LDRB ofrmlr | LDRB ofrmar | LDRB ofrmrr | ||||||||
76 | STRB prrmll | STRB prrmlr | STRB prrmar | STRB prrmrr | STRB prrmll | STRB prrmlr | STRB prrmar | STRB prrmrr | ||||||||
77 | LDRB prrmll | LDRB prrmlr | LDRB prrmar | LDRB prrmrr | LDRB prrmll | LDRB prrmlr | LDRB prrmar | LDRB prrmrr | ||||||||
78 | STR ofrpll | STR ofrplr | STR ofrpar | STR ofrprr | STR ofrpll | STR ofrplr | STR ofrpar | STR ofrprr | ||||||||
79 | LDR ofrpll | LDR ofrplr | LDR ofrpar | LDR ofrprr | LDR ofrpll | LDR ofrplr | LDR ofrpar | LDR ofrprr | ||||||||
7A | STR prrpll | STR prrplr | STR prrpar | STR prrprr | STR prrpll | STR prrplr | STR prrpar | STR prrprr | ||||||||
7B | LDR prrpll | LDR prrplr | LDR prrpar | LDR prrprr | LDR prrpll | LDR prrplr | LDR prrpar | LDR prrprr | ||||||||
7C | STRB ofrpll | STRB ofrplr | STRB ofrpar | STRB ofrprr | STRB ofrpll | STRB ofrplr | STRB ofrpar | STRB ofrprr | ||||||||
7D | LDRB ofrpll | LDRB ofrplr | LDRB ofrpar | LDRB ofrprr | LDRB ofrpll | LDRB ofrplr | LDRB ofrpar | LDRB ofrprr | ||||||||
7E | STRB prrpll | STRB prrplr | STRB prrpar | STRB prrprr | STRB prrpll | STRB prrplr | STRB prrpar | STRB prrprr | ||||||||
7F | LDRB prrpll | LDRB prrplr | LDRB prrpar | LDRB prrprr | LDRB prrpll | LDRB prrplr | LDRB prrpar | LDRB prrprr | ||||||||
80 | STMDA | |||||||||||||||
81 | LDMDA | |||||||||||||||
82 | STMDA w | |||||||||||||||
83 | LDMDA w | |||||||||||||||
84 | STMDA u | |||||||||||||||
85 | LDMDA u | |||||||||||||||
86 | STMDA uw | |||||||||||||||
87 | LDMDA uw | |||||||||||||||
88 | STMIA | |||||||||||||||
89 | LDMIA | |||||||||||||||
8A | STMIA w | |||||||||||||||
8B | LDMIA w | |||||||||||||||
8C | STMIA u | |||||||||||||||
8D | LDMIA u | |||||||||||||||
8E | STMIA uw | |||||||||||||||
8F | LDMIA uw | |||||||||||||||
90 | STMDB | |||||||||||||||
91 | LDMDB | |||||||||||||||
92 | STMDB w | |||||||||||||||
93 | LDMDB w | |||||||||||||||
94 | STMDB u | |||||||||||||||
95 | LDMDB u | |||||||||||||||
96 | STMDB uw | |||||||||||||||
97 | LDMDB uw | |||||||||||||||
98 | STMIB | |||||||||||||||
99 | LDMIB | |||||||||||||||
9A | STMIB w | |||||||||||||||
9B | LDMIB w | |||||||||||||||
9C | STMIB u | |||||||||||||||
9D | LDMIB u | |||||||||||||||
9E | STMIB uw | |||||||||||||||
9F | LDMIB uw | |||||||||||||||
A0 | B | |||||||||||||||
A1 | ||||||||||||||||
A2 | ||||||||||||||||
A3 | ||||||||||||||||
A4 | ||||||||||||||||
A5 | ||||||||||||||||
A6 | ||||||||||||||||
A7 | ||||||||||||||||
A8 | ||||||||||||||||
A9 | ||||||||||||||||
AA | ||||||||||||||||
AB | ||||||||||||||||
AC | ||||||||||||||||
AD | ||||||||||||||||
AE | ||||||||||||||||
AF | ||||||||||||||||
B0 | BL | |||||||||||||||
B1 | ||||||||||||||||
B2 | ||||||||||||||||
B3 | ||||||||||||||||
B4 | ||||||||||||||||
B5 | ||||||||||||||||
B6 | ||||||||||||||||
B7 | ||||||||||||||||
B8 | ||||||||||||||||
B9 | ||||||||||||||||
BA | ||||||||||||||||
BB | ||||||||||||||||
BC | ||||||||||||||||
BD | ||||||||||||||||
BE | ||||||||||||||||
BF | ||||||||||||||||
C0 | STC ofm | |||||||||||||||
C1 | LDC ofm | |||||||||||||||
C2 | STC prm | |||||||||||||||
C3 | LDC prm | |||||||||||||||
C4 | STC ofm | |||||||||||||||
C5 | LDC ofm | |||||||||||||||
C6 | STC prm | |||||||||||||||
C7 | LDC prm | |||||||||||||||
C8 | STC ofp | |||||||||||||||
C9 | LDC ofp | |||||||||||||||
CA | STC prp | |||||||||||||||
CB | LDC prp | |||||||||||||||
CC | STC ofp | |||||||||||||||
CD | LDC ofp | |||||||||||||||
CE | STC prp | |||||||||||||||
CF | LDC prp | |||||||||||||||
D0 | STC unm | |||||||||||||||
D1 | LDC unm | |||||||||||||||
D2 | STC ptm | |||||||||||||||
D3 | LDC ptm | |||||||||||||||
D4 | STC unm | |||||||||||||||
D5 | LDC unm | |||||||||||||||
D6 | STC ptm | |||||||||||||||
D7 | LDC ptm | |||||||||||||||
D8 | STC unp | |||||||||||||||
D9 | LDC unp | |||||||||||||||
DA | STC ptp | |||||||||||||||
DB | LDC ptp | |||||||||||||||
DC | STC unp | |||||||||||||||
DD | LDC unp | |||||||||||||||
DE | STC ptp | |||||||||||||||
DF | LDC ptp | |||||||||||||||
E0 | CDP | MCR | CDP | MCR | CDP | MCR | CDP | MCR | CDP | MCR | CDP | MCR | CDP | MCR | CDP | MCR |
E1 | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
E2 | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
E3 | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
E4 | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
E5 | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
E6 | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
E7 | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
E8 | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
E9 | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
EA | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
EB | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
EC | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
ED | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
EE | MCR | MCR | MCR | MCR | MCR | MCR | MCR | MCR | ||||||||
EF | MRC | MRC | MRC | MRC | MRC | MRC | MRC | MRC | ||||||||
F0 | SWI | |||||||||||||||
F1 | ||||||||||||||||
F2 | ||||||||||||||||
F3 | ||||||||||||||||
F4 | ||||||||||||||||
F5 | ||||||||||||||||
F6 | ||||||||||||||||
F7 | ||||||||||||||||
F8 | ||||||||||||||||
F9 | ||||||||||||||||
FA | ||||||||||||||||
FB | ||||||||||||||||
FC | ||||||||||||||||
FD | ||||||||||||||||
FE | ||||||||||||||||
FF |
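To look up a 32-bit ARM instruction in the table above, the row is bits 27-20 of the instruction word and the column is bits 7-4. The following is a minimal Python sketch of that lookup; the function name `arm_table_index` and the sample encodings are illustrative assumptions, not part of the original map.

```python
def arm_table_index(instruction: int) -> tuple[int, int]:
    """Return (row, column) of the opcode map for a 32-bit ARM instruction.

    row    = bits 27-20 of the instruction word (0x00-0xFF)
    column = bits 7-4 of the instruction word   (0x0-0xF)
    """
    row = (instruction >> 20) & 0xFF
    column = (instruction >> 4) & 0xF
    return row, column


# Sample encodings, chosen for illustration (condition field 0xE, "always").
for word in (0xE0810002, 0xE12FFF1E, 0xE0000291):
    row, column = arm_table_index(word)
    print(f"{word:08X}: row {row:02X}, column {column:X}")
```

For example, 0xE12FFF1E lands on row 12, column 1, which is the BX cell, and 0xE0000291 lands on row 00, column 9, which is the MUL cell.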
Bits 15-12 \ Bits 11-8 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | LSL imm | LSR imm | ||||||||||||||
1 | ASR imm | ADD reg | SUB reg | ADD imm3 | SUB imm3 | |||||||||||
2 | MOV i8r0 | MOV i8r1 | MOV i8r2 | MOV i8r3 | MOV i8r4 | MOV i8r5 | MOV i8r6 | MOV i8r7 | CMP i8r0 | CMP i8r1 | CMP i8r2 | CMP i8r3 | CMP i8r4 | CMP i8r5 | CMP i8r6 | CMP i8r7 |
3 | ADD i8r0 | ADD i8r1 | ADD i8r2 | ADD i8r3 | ADD i8r4 | ADD i8r5 | ADD i8r6 | ADD i8r7 | SUB i8r0 | SUB i8r1 | SUB i8r2 | SUB i8r3 | SUB i8r4 | SUB i8r5 | SUB i8r6 | SUB i8r7 |
4 | DP g1 | DP g2 | DP g3 | DP g4 | ADDH | CMPH | MOVH | BX reg | LDRPC r0 | LDRPC r1 | LDRPC r2 | LDRPC r3 | LDRPC r4 | LDRPC r5 | LDRPC r6 | LDRPC r7 |
5 | STR reg | STRH reg | STRB reg | LDRSB reg | LDR reg | LDRH reg | LDRB reg | LDRSH reg | ||||||||
6 | STR imm5 | LDR imm5 | ||||||||||||||
7 | STRB imm5 | LDRB imm5 | ||||||||||||||
8 | STRH imm5 | LDRH imm5 | ||||||||||||||
9 | STRSP r0 | STRSP r1 | STRSP r2 | STRSP r3 | STRSP r4 | STRSP r5 | STRSP r6 | STRSP r7 | LDRSP r0 | LDRSP r1 | LDRSP r2 | LDRSP r3 | LDRSP r4 | LDRSP r5 | LDRSP r6 | LDRSP r7 |
A | ADDPC r0 | ADDPC r1 | ADDPC r2 | ADDPC r3 | ADDPC r4 | ADDPC r5 | ADDPC r6 | ADDPC r7 | ADDSP r0 | ADDSP r1 | ADDSP r2 | ADDSP r3 | ADDSP r4 | ADDSP r5 | ADDSP r6 | ADDSP r7 |
B | ADDSP imm7 | PUSH | PUSH lr | POP | POP pc | BKPT | ||||||||||
C | STMIA r0 | STMIA r1 | STMIA r2 | STMIA r3 | STMIA r4 | STMIA r5 | STMIA r6 | STMIA r7 | LDMIA r0 | LDMIA r1 | LDMIA r2 | LDMIA r3 | LDMIA r4 | LDMIA r5 | LDMIA r6 | LDMIA r7 |
D | BEQ | BNE | BCS | BCC | BMI | BPL | BVS | BVC | BHI | BLS | BGE | BLT | BGT | BLE | SWI | |
E | B | BLX off | ||||||||||||||
F | BL setup | BL off |
Bits 9-8 \ Bits 7-6 | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | AND | EOR | LSL | LSR |
1 | ASR | ADD | SUB | ROR |
2 | TST | NEG | CMP | CMN |
3 | ORR | MUL | BIC | MVN |
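The Thumb tables can be indexed in the same way: the main table by bits 15-12 (row) and bits 11-8 (column) of the 16-bit instruction, and the small table by bits 9-8 (row) and bits 7-6 (column), which corresponds to the four DP cells in row 4. The following Python sketch shows both lookups; the function names and sample encodings are assumptions made for illustration.

```python
def thumb_table_index(halfword: int) -> tuple[int, int]:
    """Return (row, column) in the Thumb table: bits 15-12 and bits 11-8."""
    row = (halfword >> 12) & 0xF
    column = (halfword >> 8) & 0xF
    return row, column


def thumb_dp_index(halfword: int) -> tuple[int, int]:
    """Return (row, column) in the data-processing table: bits 9-8 and bits 7-6."""
    row = (halfword >> 8) & 0x3
    column = (halfword >> 6) & 0x3
    return row, column


# 0x4770 is chosen as a sample encoding of "BX lr".
print(thumb_table_index(0x4770))

# 0x4348 is chosen as a sample encoding of "MUL r0, r1".
print(thumb_table_index(0x4348), thumb_dp_index(0x4348))
```

Here 0x4770 falls on row 4, column 7 (BX reg), while 0x4348 falls on row 4, column 3 (DP g4) and then on row 3, column 1 of the small table (MUL).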
This page demonstrates an implementation of the classic game 'Tetris', in JavaScript. The major purpose is to demonstrate how addition to and manipulation of the DOM can be performed easily: the Tetris display consists of two hundred DIVs in a 10x20 grid, which are given different class names to represent the blocks in the well.
JavaScript also makes it easy to retrieve events from the keyboard; the 'window' object provides an 'event' subsystem, which in combination with the setting of event handlers for the keyboard allows the script to capture and process keys. This implementation of Tetris uses four keys:
You're free to take a look at the source, and explore the game for any bugs which may exist (I already know of a couple). Enjoy.
There was, however, a snag to this: I wanted to use the existing Windows installation, because I'd tuned it up and installed the software I always use. I expressly didn't want a virtual disk image duplicating my Windows drive, since I didn't have the space for that. So, that was the task: running the Windows partition in a VM.
I hunted around the 'Net, and found surprisingly little information on this: the procedure I finally threw together was sourced from many disparate places. So, in one place, I've put together the steps you'll need to take in order to get a Windows partition running inside a VM.
You'll need a few tools in order to pull the information you need, and to run the finished VM:
vmplayer, and you're good to go.

On a Gentoo Linux installation, you can get the software you need on the Linux side from the following command (on other distributions, check your associated documentation):
emerge vmware-player vmware-modules parted
VMware needs a virtual disk descriptor file, telling it how the disk is set up and structured. So, dump the following into a file called WindowsXP.vmdk:
    # Disk DescriptorFile
    version=1
    CID=9428f535
    parentCID=ffffffff
    createType="fullDevice"

    # Extent description
    RW 63 FLAT "WindowsXP.mbr" 0
    RW 23579072 FLAT "/dev/hda" 63

    # The Disk Data Base
    #DDB

    ddb.toolsVersion = "6530"
    ddb.adapterType = "ide"
    ddb.virtualHWVersion = "4"
    ddb.geometry.sectors = "63"
    ddb.geometry.heads = "240"
    ddb.geometry.cylinders = "1559"
The values highlighted in red are ones you'll need to change, depending on the characteristics of your hard disk: they describe my disk quite well.
Now you can fire up Parted against the disk you want to use. If you have Windows on one hard disk and Linux on another, use the Windows disk in the command below, otherwise just use the disk device containing the Windows partition.
    # parted /dev/hda
    GNU Parted 1.7.1
    Using /dev/hda
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) unit s
    (parted) unit cyl
    (parted)
Note the unit s command, which tells Parted to print out its values in terms of disk sectors: we'll be using these values in the VMDK file. Also note the second unit command, to provide the values in cylinders; that allows us to fetch the disk geometry in CHS format.
The values in red are the ones we'll be using. But not so fast; before you plug the values in, we'll need to do some calculation. Instead of using the hard disk's standard boot sector, which allows you to boot Windows or Linux, we ideally want the VM to boot only Windows. We'll be doing that by telling the Windows partition to boot, ignoring the Linux boot menu, and then making a copy of that bootsector for the VM to use.
All that means we need two lines in the VMDK, as shown above: a line for the bootsector copy, starting at sector 0 and stretching for 63 sectors; and the rest of the hard disk, starting at sector 63. And that means a little calculation: the value in the VMDK for the size of the disk is 63 less than the actual disk.
23579135 - 63 = 23579072
Note that this is the value I've got in my VMDK, shown above.
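The same calculation can be sketched in Python, producing the two extent lines for the VMDK from the total sector count reported by Parted. The function name `vmdk_extents` and its default arguments are assumptions for illustration; the 63-sector boot area and the file names are the ones used in this guide.

```python
BOOT_SECTORS = 63  # sectors covered by the bootsector copy


def vmdk_extents(total_sectors: int, disk: str = "/dev/hda",
                 mbr_file: str = "WindowsXP.mbr") -> list[str]:
    """Build the two extent lines: bootsector copy first, then the rest of the disk."""
    rest = total_sectors - BOOT_SECTORS
    return [
        f'RW {BOOT_SECTORS} FLAT "{mbr_file}" 0',
        f'RW {rest} FLAT "{disk}" {BOOT_SECTORS}',
    ]


# 23579135 is the sector count of the disk used in this guide.
for line in vmdk_extents(23579135):
    print(line)
```

For a different disk, only the sector count and the device name should need to change.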
I mentioned above that we'll be using a copy of the disk's bootsector for the VM, with Windows set to boot. We'll need to set that up first:
    (parted) set 1 boot on
    (parted) quit
    #
Now to make a copy of the bootsector, using the infamous dd utility:
    # dd if=/dev/hda of=WindowsXP.mbr bs=512 count=63
    63+0 sectors in
    63+0 sectors out
    #
Once the VMDK is set up, we need to tell VMware what exactly it'll be booting, and which hardware to emulate. This is done by the information file, which in this case is WindowsXP.vmx:
    config.version = "8"
    virtualHW.version = "4"
    uuid.location = "56 4d 56 4a 7b d7 4c 30-f5 80 d6 8b c4 59 aa eb"
    uuid.bios = "56 4d 56 4a 7b d7 4c 30-f5 80 d6 8b c4 59 aa eb"
    uuid.action = "create"
    checkpoint.vmState = ""
    displayName = "Windows XP Professional"
    annotation = ""
    guestinfo.vmware.product.long = ""
    guestinfo.vmware.product.url = ""
    guestOS = "winxppro"
    numvcpus = "1"
    memsize = "128"
    paevm = "FALSE"
    sched.mem.pshare.enable = "TRUE"
    MemAllowAutoScaleDown = "FALSE"
    MemTrimRate = "-1"
    nvram = "WindowsXP.nvram"
    mks.enable3d = "FALSE"
    vmmouse.present = "FALSE"
    vmmouse.fileName = "auto detect"
    tools.syncTime = "TRUE"
    tools.remindinstall = "FALSE"
    isolation.tools.hgfs.disable = "FALSE"
    isolation.tools.dnd.disable = "FALSE"
    isolation.tools.copy.enable = "TRUE"
    isolation.tools.paste.enabled = "TRUE"
    gui.restricted = "FALSE"
    ethernet0.present = "TRUE"
    ethernet0.connectionType = "nat"
    ethernet0.addressType = "generated"
    ethernet0.generatedAddress = "00:0c:29:59:aa:eb"
    ethernet0.generatedAddressOffset = "0"
    usb.present = "TRUE"
    usb.generic.autoconnect = "TRUE"
    sound.present = "TRUE"
    sound.virtualdev = "sb16"
    ide0:0.present = "TRUE"
    ide0:0.fileName = "WindowsXP.vmdk"
    ide0:0.mode = "independent-persistent"
    ide0:0.deviceType = "rawDisk"
    ide0:0.redo = ""
    ide0:0.writeThrough = "FALSE"
    ide0:0.startConnected = "TRUE"
    ide1:0.present = "TRUE"
    ide1:0.fileName = "/dev/cdrom"
    ide1:0.deviceType = "atapi-cdrom"
    ide1:0.writeThrough = "FALSE"
    ide1:0.startConnected = "TRUE"
    floppy0.present = "TRUE"
    floppy0.fileName = "/dev/fd0"
    floppy0.startConnected = "TRUE"
    serial0.present = "FALSE"
    serial1.present = "FALSE"
    parallel0.present = "FALSE"
I've highlighted a couple of values which you might want to change: the location of the CD drive in the device tree, and the amount of memory you want to allocate to the VM.
We're not done yet; the Linux side of setting up the VM is running, but we now need to tell Windows that it'll be booting into a VM. The problem is rooted in the fact that the VM's hardware is different to your physical computer; thus, we need to add a hardware profile to the Windows partition.
We can also take this opportunity to test that the Windows partition boots up immediately, without a boot menu in the way. Reboot, and if the boot process doesn't run straight to Windows, you may need to tweak the partition boot settings and recreate the bootsector copy.
Once you're booted into Windows, pull open the System properties (Control Panel -> System, or My Computer (rightclick)-> Properties):
The default setup for this Hardware Profiles display is one profile, called "Default". Click on "Copy", to create a new profile, and call that one "VMware", then move it to the top of the list with the arrow buttons. You can see the settings I use in the image above, and we'll see exactly what that means for the Windows boot process in a little while.
While you have the System Properties open, pull open the Driver Signing properties, and set the value of what action Windows should take to "Ignore"; this allows any drivers to be installed automatically if devices are picked up by Windows.
This is also a good time to set up a helper script, which will install the VMware Tools: a set of drivers for the VMware emulated devices, and some services to help the Windows VM along. This could be done after the VM is set up and running, but I had issues with that, as detailed later. I've decided to put it here in stage 3, to catch the problem before it begins.
The VMware tools are provided by VMware as a CD ISO image, buried within the VMware Workstation software. It's relatively easy to find: first of all, download Workstation from VMware, or any Linux mirror (I've given a sample mirror below):
http://ftp.snt.utwente.nl/pub/os/linux/gentoo/distfiles/VMware-workstation-5.5.3-34685.tar.gz
Once you've got the file, extract its contents (you may need third-party tools to do this in Windows), and look for a file called windows.iso; this is the Tools CD image. If you feel like wasting a CD, burn the image to one, or you can mount the image as a drive using Daemon Tools or similar software.
When you can see the contents of the image, copy the files to your hard disk, in a directory called C:\VMTools or something similar. Then dump the following into a .cmd file:
    if exist C:\VMTools\ToolsHelperLock.txt msiexec -i "C:\VMTools\VMware Tools.msi" /qn
    if exist C:\VMTools\ToolsHelperLock.txt shutdown -r -f -t 30
    del C:\VMTools\ToolsHelperLock.txt
Also, put a small message (any you like) into ToolsHelperLock.txt in the same directory. The lock file will be used by the helper script to work out if it needs to reboot. Once you've done this, add the script to your Start Menu's Startup folder, and you'll be away.
Log yourself out of Windows and reboot; that should be the last time it needs to be started physically. Boot yourself into Linux; it's time to test this thing.
When you fire up the VM, it should now automatically pick up the Windows partition, and begin booting it. The presence of two hardware profiles means that Windows will ask which one you wish to use:
Pick the VMware profile, and Windows should boot, into a very minimal VGA-colour mode: it doesn't have the drivers for the "VMware SVGA" yet. At this point, the helper script should kick in, install the VMware tools, and then reboot the VM. Make sure to remove the helper script from your Startup folder upon the next reboot; even though it won't run, it'll still flash a Command-Prompt window up for a short while.
I mentioned earlier that I had a problem with my keyboard and mouse not being detected by Windows in the VM, thus bringing about the helper script to allow the VMware tools to install the requisite drivers. I did have another issue with the VM, and that was getting it to talk to the outside world.
Initially, I set the VM's network card to a "bridged" type, allowing it to reside on the same network as the host machine; this usually works fine. In the case of a laptop with wireless, however, it didn't so much: no communication. After some experimentation, I resorted to the NAT method: one network that the host computer sits on, and another between the host and VM. This also involves a bit of iptables trickery, so I'm putting it in as a part of this guide.
In my case, the normal wireless network resides on 192.168.1.0/24, so I decided to put the virtual network at 192.168.58.0/24. This means the host gets an address of 192.168.58.1 on the NAT network, and the VM's network connection gets a static IP of 192.168.58.2; it also means the following changes to the Linux host's configuration:
net.ipv4.ip_forward = 1
    ifconfig ra0:1 192.168.1.252
    iptables -t nat -A PREROUTING -i ra0 -d 192.168.1.252 -j DNAT --to-destination 192.168.58.2
    iptables -t nat -A POSTROUTING -o ra0 -s 192.168.58.2 -j SNAT --to-source 192.168.1.252
    iptables -A INPUT -i ra0 -d 192.168.1.252 -p tcp -j ACCEPT
    iptables -A INPUT -i ra0 -d 192.168.1.252 -p udp -j ACCEPT
    iptables -A INPUT -i ra0 -d 192.168.1.252 -p icmp -j ACCEPT
Note how the wireless interface ra0 has had a virtual interface added, dedicated to the transfer of traffic for the virtual machine NAT. The particular configuration files you need to change may differ depending on the distribution; the changes above are from my Gentoo system.
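If you pick different networks, it may be worth checking the address plan before writing the iptables rules. The following Python sketch uses the standard ipaddress module to confirm that the two networks don't overlap and that each address sits in the network it should; the variable names and the `plan_is_consistent` helper are assumptions for illustration, with the addresses taken from the setup above.

```python
import ipaddress

# Networks and addresses from the setup described above.
wireless_net = ipaddress.ip_network("192.168.1.0/24")
virtual_net = ipaddress.ip_network("192.168.58.0/24")

alias_ip = ipaddress.ip_address("192.168.1.252")    # the ra0:1 virtual interface
host_nat_ip = ipaddress.ip_address("192.168.58.1")  # host side of the NAT network
vm_ip = ipaddress.ip_address("192.168.58.2")        # static IP inside the VM


def plan_is_consistent() -> bool:
    """Check that the networks are distinct and every address is in the right one."""
    return (
        not wireless_net.overlaps(virtual_net)
        and alias_ip in wireless_net
        and host_nat_ip in virtual_net
        and vm_ip in virtual_net
    )


print(plan_is_consistent())
```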
After all's said and done, you should have something like this:
Let me know if you get it working, or if you want to shout at me about something.
As is documented elsewhere, Brainf*ck has eight simple operations, which may act on a data buffer of a given size. Trainf*ck extends the list of operations to eighteen, while leaving the other aspects as is. The new instructions are as follows (P refers to the current address in the data space).
# - Open or close a file. Takes a null-terminated filename string starting at P. If called again, closes the currently open file; no parameter required for close.
; - Read a byte from file. Saves the byte to P.
: - Write a byte to file. Fetches the byte from P, and writes.
( - Rewind one byte. Modifies P: 0 if start of file was reached, 1 otherwise.
) - Move forward a byte. Modifies P: 0 if end of file was reached, 1 otherwise.
% - Connect to an address/port. Takes two parameters: big-endian IPv4 address starting at P, and big-endian TCP port number starting at P+4. If called again, closes currently open socket; no parameters required.
$ - Listen on an address/port. Takes address and port as detailed for %. If called again, closes currently open socket; no parameters required.
@ - Accepts an incoming connection. If called again, closes the currently accepted connection.
` - Receive a byte from the network stream. Saves the byte to P, or zero if connection was closed.
' - Send a byte. Fetches the byte from P, and sends.

The canonical "Hello World" example as written in Brainf*ck will run under Trainf*ck exactly as it does under Brainf*ck. An example implementation follows, which prints out "Hello World!".
    >+++++++++[<++++++++>-]<.>
    +++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]<.
    >>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+.
A simple example demonstrating Trainf*ck's file handling capabilities is that of finding the size of a file. The following example calculates the size of the file "abcd", and outputs it as a character code value. The caveat with this simple example is that it will wrap the counted value after 255.
    >++++++++++++++++[<++++++>-]<   Generate backtick in 0
    [>+>+>+>+<<<<-]                 Copy 4 times
    >+>++>+++>++++                  Generate "abcd"
    <<<#<+                          Open file and init counter
    >[<+:>)]                        Count up file size
    <.#                             Output file size and close
Networking is where Trainf*ck's capabilities shine: the language provides the ability to perform networking tasks easily and efficiently. The following example is an implementation of an 'echo' daemon, which listens on port 20480 and repeats any text sent to it.
    >>>                         Address 0x00000000
    >>++++++++[<++++++++++>-]   Port 0x5000
    <<<<<$                      Listen
    +[                          Continual loop
      @                         Accept a connection
      [`']                      Loop echoing
      @                         Close connection
    +]                          Continue endless loop
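The first two lines of the echo daemon build the six bytes that $ expects: a big-endian IPv4 address in cells 0 to 3, followed by a big-endian port in cells 4 and 5. As a cross-check, the following Python sketch produces the same layout with the standard socket and struct modules; the function name `listen_parameters` is an assumption for illustration and is not part of the Trainf*ck interpreter.

```python
import socket
import struct


def listen_parameters(address: str, port: int) -> bytes:
    """Lay out the six parameter bytes: 4-byte IPv4 address, then 2-byte port.

    Both fields are big-endian, as described for the % and $ instructions.
    """
    return socket.inet_aton(address) + struct.pack(">H", port)


# Address 0.0.0.0 and port 20480 (0x5000), as used by the echo daemon.
cells = listen_parameters("0.0.0.0", 20480)
print(list(cells))
```

The result is four zero bytes followed by 80 and 0, matching the Trainf*ck code, which leaves 80 (0x50) in cell 4 and 0 in cell 5 before moving P back to cell 0.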
The Trainf*ck interpreter is written in C++, and has been tested on Linux. The code has been designed with portability in mind, and as such there should be no major issues with execution on other systems.
The interpreter is currently incomplete: keyboard input is not yet implemented. It is anticipated that this will be added at a later date.
You can obtain the interpreter and the sample files detailed above from the following address.
]]>OPEN: A display of a complex graphical shape, with numbers scrolling down one side. A phone rings.
ABDUL
Yeah?
CHRIS (phone)
We gotta talk about these test runs.
ABDUL
Yeah, I'm in the middle of one right now. Can't we talk a little later?
CHRIS
Seriously, I gotta talk to you.
ABDUL (sighs)
Alright, where are you?
EXT: A shadowed street corner, looking up the road to Abdul's apartment. Abdul and Chris are standing behind a building, at the left of screen.
CHRIS
I think they're onto us.
ABDUL
That's the second time this week, Chris. We can go to the chemist tomorrow if the pills aren't working any more.
CHRIS
Nah, I saw a van a few blocks down. They're gonna move in.
A black VW van, with lights off, rushes down the road, headed in the direction of Abdul's apartment.
ABDUL
Alright. I can take care of this. You made the backups, right?
CHRIS (holding up a data tape)
Yeah, last night.
ABDUL
Put it away already, we don't want to lose all the test runs.
CUT TO: Horizontal-split screen
[TOP] Abdul pulls out a PDA, and commences to tapping on the screen.
[BOTTOM] Green text, typed:
TEXT
> barctl --pressure 0.75
[TOP] CUT TO: INT: Abdul's apartment. A mess of computer boxes and stacks of paper, with wires running across the floor.
PAN TO: A meter at the right side of the door, marked "Air", delineated from 0 to 2 bar, with a needle dropping to 0.75 before stopping.
[BOTTOM] Green text, typed:
TEXT
> sendserial 0b11110000
Connection lost
[TOP] Screen whites out.
CUT TO: EXT: Street corner
ABDUL
That's all four explosive lines triggered. I've dropped the air pressure, so the place doesn't blow out immediately. As long as they don't open the door, they'll be fine.
CUT TO: INT: Corridor outside Abdul's apartment. A group of six men, dressed as a police squad, is striding down the corridor, camera tracking from in front. They arrive at a particular door, and stop.
OFFICER (pounding on door)
Police! Open up!
The police officer considers for a second, then puts the palm of his hand against the door.
OFFICER
Shit, it's warm. Shouldn't we get the fire service or something?
CAPTAIN
Fuck the fire service, we need that data. Open the damn door!
The officer joins another with a door-battering ram, and slams it against the door twice.
CUT TO: EXT: Street corner. A few blocks behind Abdul and Chris, an apartment block is torn by an explosion, blowing out the windows of Abdul's place. A fireball emanates from the window, before flames start spewing. A car alarm is triggered nearer Abdul and Chris.
ABDUL
They never learn.
Abdul and Chris get into Chris's car parked a few yards off, and drive away.]]>
"13 minutes remaining."
By all accounts, John had a simple enough job in the organisation: monitor the computers, as they went about their task. He had been told by the previous holder of the post that the software had an "issue" with a certain process; it tended to get "stuck", as he put it, near the end. He'd referred to it as the "13-minute gap", because the software always stuck when it displayed that figure as the remaining time to completion.
This was John's second day, and already he'd seen the software stick twice: once on the process last night, and again right now. Both times, it had taken a good span of time before the computers were moving again, and it seemed that John could do nothing to speed it up. He was a little worried that his job seemed redundant; the setup never went wrong, since it was able to cut out any computer units that went bad for whatever reason, and compensate for it.
At a wild guess, it was this compensation that the system had been doing for
the last half hour, since the displays still stubbornly read "13 minutes".
John decided it was time to leave the system be, and do some homework. He was
enrolled on a part-time course with the local university, and this job had
already given him ample time for study, even in his short time in the post. He
settled down with his dense book on legal theory, and set to.
John woke up, probably a few hours later, to find the computer room much quieter. Obviously, in the intervening period taken by his nap, the task allocated to the system had been completed. One of the computers had a few sheafs of paper freshly printed in its out-tray, and he guessed he'd have to pick those up and take them to the manager. It could wait a few more hours though, John thought, and he did need more sleep.
Abdul studied the report produced by the overnights in more detail. The first time he'd read it, the result had looked a bit surprising. It seemed that the task had finally hit the right nail with the right hammer.
It had been Moassim's idea; he tended to come up with such crazy theories, most of which had no possible basis in fact. Abdul had listened to this one like most that Moassim had come out with before, with a healthy dose of scepticism, but as he listened, the idea toyed with him: it seemed to gain substance, to become a possibility. So Abdul had decided to try modelling the plan, to plug the specifications into a computer and see what would happen.
The idea ran thus. The Earth's crust was a pretty robust thing, and could take small perturbations like volcano eruptions, or climate shifts, in its stride. This was only the case when these shocks were delivered in solitary, however: one large eruption or one sudden cooling was no real problem, but a prolonged series could become a problem.
That was the root of Moassim's theory: apply a large enough number of significant but relatively small shocks to the crust, and it would dissolve over a large area. Moassim saw it as the ideal strike: highly sophisticated, beyond the comprehension of the enemy. Abdul hadn't been so sure, which was why he'd decided to try it out mathematically first.
Abdul had set up a bank of computers to model the chain of detonation as per Moassim's theory, looking for specific places and times where the chain could best be maintained. Each night, a build process would run the tweaked parameters, and change them a little, heading towards the ideal point: the longest possible chain for a given amount of energy input.
Last night, the computer bank had hit that ideal point. Small amounts of antimatter, placed at the coordinates indicated by the printout and then detonated in sequence, would cause the crust of the European plate to resonate; more detonations, as detailed by the model, would mean total dissolution of the crust across the European area.
It was a vindication: the plan would work. There was just one snag: Abdul didn't know of anyone in the world who could produce antimatter in any quantity, and he was going to need a lot.
]]>The room was lit by fluorescent white lamps from the ceiling, a few of which were flickering occasionally, emitting small pops as they fired on and then went out. Shelves lined the walls, littered with sheafs of paper and well-thumbed books, on such dry subjects as "Electromagnetic Fields at the Subatomic Scale"; stacks of paper and books lay haphazardly at the foot of a few of the shelves, as if abandoned.
In the corners lay relics of what could be old pieces of machinery, or perhaps half-finished projects. A set of thin windows was set high in the walls, not designed for looking out upon the world; the night made them appear as black slots in walls of grey. Only one area of the room was relatively clean: a desk, up against one set of bookshelves, on which were eight identical boxes. Wires trailed away to power sockets, and more wires connected the boxes together. They seemed to be computers, since there was a ninth, smaller box with a screen and keyboard attached.
There was a small thump at the door to the room, and then a lock turned. The door opened, and a man stumbled in, evidently the owner of the property. He was dressed in clothes that had probably seen better days: a plain blue tee-shirt, fading black jeans. Indeed, he had probably seen better days, since his face told a story of tipsiness, confirmed by the can of beer he held in one hand. Blinking at the sudden fluorescent light, he flicked a switch to douse the room in shade; a dim ambience filtered down from the slitted windows, so he could at least see.
His boots cleared a path through the detritus on the floor, as he made his way to the desk; setting his can down on the desk, he pulled a chair from nearby and sat down, before hitting a key on the keyboard. The screen flashed to life, and he again shrank away before his eyes adjusted. On the screen was depicted some sort of process: a block on the left side was slowly transferring itself to the right, as if through a horizontal hourglass. A column of numbers scrolled down the far right, continuously updating with new values as the block changed in size. The man seemed pleased with what he saw, and took a sip from his can.
The phone rang. An instinctive jerk away from the source of the noise nearly toppled the chair, before he realised that the phone was in his pocket. He fished it out, and set to finding the Answer button.
"Yuh?" he blurted when he finally found it.
"Abdul, you sound like crap. Get online."
"Wha- who's that?"
"Just connect, yeah. I've got something you might wanna see."
And with that, the phone hung up. Abdul stuffed the phone back into his pocket and took another sip from his can, before hitting a few strokes on the keyboard. The visualisation vanished, and another area appeared: the screen went black, apart from lines of grey text near the bottom.
[Joined #physics-discuss]
<Qubic> the man himself. morning
<foo> ye, what, i gotta be up at 6am
<Qubic> yeah, why are you still awake? what you been up to
<foo> drink. now come on, why you want me here
<Qubic> i found someone working on producing tau neutrinos. interested?
That made Abdul sit up. His latest project needed a cheap and easy way to produce neutrinos: sub-atomic particles that could pass through matter without affecting it, only interacting one time in a billion. From the models Abdul had been running on the computers, neutrinos would speed his process up by many hundreds of times, and those specifically of the tau type were the best of the lot. He was definitely interested.
<foo> for sure. throw me an email addr, ill get in touch tomorrow
<foo> but fer now, im going bed. gotta get _some_ sleep
Abdul's friend, who went by the name of Qubic, had helped a lot with this particular project: he had set up the little cluster of computers sitting on the desk, and written the software to get them talking and working on the modelling system. Now, he'd tracked down someone whom Abdul would need if he wanted to try the model in practice, someone who could provide these particles to speed up the process. Abdul resolved to buy the man a drink sometime.
Not now, though. For the moment, he wanted sleep. He stood up, nearly overbalancing but managing to correct himself. Stumbling over to the mattress which lay on the floor across the room, he fell into it, sending a few stray papers scattering across the floor and up a couple of feet into the air before they swung back down.
]]>The way it had been before was this: a patchwork of locally-administered generation plants, perhaps a few serving each city, mostly using methods which hadn't changed for hundreds of years. By far the most common had been the heating of water in order to spin magnets, which had compound problems: use of a dirty fuel to heat the water, resulting in massive pollution, and the inherent inefficiency of the process. People in the power business once talked about 30% as if it was a good figure.
Of course, major problems arose if a city came close to running out of power. Because each of these local generation plants was fixed, and it was hugely expensive to build another one, the residents often faced such concepts as the rolling blackout instead of having a reliable supply. Such an idea sounded like a ridiculous way to run a city.
It took the invention of two seemingly separate technologies to change the status quo. The first was practicable nuclear fusion: the creation of a little star in order to feed off its output. The Earth as a whole seemed to get along just fine with fusion, lapping up the rays of the Sun; if the planet could do it, the human race could adapt the technique.
The final race to build a viable fusion plant had been between China and the United States, but the breakthrough came from neither quarter; it came from a research laboratory in France. The lab had been approached by both sides, but refused to give the plant away in exclusivity; instead, the technology was released for the world to use.
There was just one problem: size. The fusion plant was incredibly large, and no city could viably set aside such a huge chunk of land to house what was essentially a giant metal sphere. It took another invention before that issue could be alleviated.
It was called the HCD by the research team who came up with it: the High-power Collimating Diffractor. Its original target was satellite TV, where large amounts of power were wasted by spreading a signal over spaces which would never have a reception dish. Instead, the HCD split the signal into focussed beams, which could then be directed every which way, towards a specific dish on the ground or to another HCD for more refined splitting.
The real breakthrough hadn't come until Rihanna Johnson had her Eureka moment: generating radio waves from a fusion plant, and splitting with a HCD. The generation of radio from fusion was a simple enough matter, and had been done before; the plant gave out incredibly intense light, which could be shifted down to radio using technologies 50 years old. That wasn't the main thrust of the idea, though; it was a single word which made the world sit up and take notice. That word was: space.
If fusion plants were so massively large that there was no way to house them on land, they could be hoisted into space instead. From there, they could generate high-powered microwaves, split into millions of fractions by a network of HCDs in orbit around the Earth, which could then beam the power down to reception dishes in every town and village.
It was a radical idea, not least because it meant abolishing the local generation infrastructure that had been painstakingly put into place over hundreds of years. The item which eventually forced the issue was, of course, cost: the price of all fossil fuels was steadily rising, and at some point in the late 21st century, all the accountants had worked out at the same time that it would in fact be cheaper to set up the fusion network than to keep the old plants running.
That was the gist of the history lesson Ryan gave in his unofficial capacity as tour guide for Fusion Pacific. He was a microwave researcher by profession, but most people up here had to take multiple jobs, simply due to a lack of personnel; it fell to him to show the tourists around.
Hawaii was the tethering point for Fusion Pacific: a huge cable, stronger than diamond, attached the land mass of the Earth to the giant ball of the power plant thousands of miles above. The space elevator had already been in construction when the fusion network had been floated as an idea; since the original plan had been to simply leave the end of the cable free in space, the fusion plant was deemed to add only a small percentage to the total mass of the system were it coupled to the endpoint.
There were two other stations in the fusion network: America, attached somewhere in the Amazon, and Eurasia which was fixed to an island off the coast of India. These had been the original three elevators, and three fusion plants was more than enough for all the world's needs.
The tour normally consisted of a trip around the plant, starting at Earth-side and working around to star-side while Ryan explained the history of the network. While working back to Earth-side, the tourists could examine output graphs from the fusion plant if they so desired, or fiddle with a sample light-frequency HCD that Ryan had put together. Instead of splitting microwaves, it split red light into hundreds of thin beams, programmatically directed at will; it made for interesting lighting, if nothing else.
]]>Winter wasn't supposed to be this cold, he was sure.
As the flames struggle upward before him, fighting a stiff breeze which threatens to drown everything in cold, he thinks about what happened to bring his life to this point. Of course, the bomb had been the start of it; as it had so abruptly ended so many lives, so it had ended any chance he had at life. Little Steph, one of the things that had been right about this world, his loving daughter: vaporised, along with the rest of Seattle, when the CAM warhead had struck all that time ago.
His brain tells him it has been but three weeks since that day when the War began, but every other part of him feels aged; he had lost any part of his mind that felt anything, otherwise he knows that would feel the pain too. He had been out of town that day, some kind of business at the Portland branch; as he sat at his desk facing the window, a sudden blinding flash had him covering his eyes for a moment. Every piece of electronic equipment in the office had been rendered dead by the flash, and a few people knew what that might mean.
Compressed anti-matter. Those bastards across the Atlantic had perfected their new weapon, and felt that somewhere in the States was a good place to test it.
He knew immediately that he'd never be able to get back into the state, but he tried anyway; his car wouldn't let him in (probably the electronics in there had gone too), so he ended up jumping on a bus. He was promptly deposited, about an hour later, at the state line, stopped by a police cordon. That confirmed it; nothing else would have brought out such a response. It could only be a CAM bomb.
So, that was it, then. The War with the Kingdom had started; the madman Thompson had decided it was time to drop the bomb that would end the world. The States would retaliate, of course; leaked documents had been splashed all over the news a few months ago about a secret research project into some new compressed form of anti-matter. "CAM", they called it, and everyone knew that the President was just insane enough to use it in a pre-emptive strike on the Kingdom. Looks like they had a project into this CAM stuff, as well.
And indeed, the bombs had flown across the Atlantic, vaporising Manchester and Leeds; the reply had been to annihilate much of the eastern seaboard, and so the two mad leaders successively ordered the destruction of each other's demesnes. Eventually, probably because the CAM had run out, the missiles stopped flying, and the smoking ruins of two countries were all that remained.
After about two weeks, he had finally been able to walk and hitch back to where Seattle had been. Those documents had been right about the power of CAM; nothing was left. Buildings, trees, people: in their place, nothing more than sand. The shop where Steph had worked was gone, along with the block and the streets around it; there wasn't even a strip of tarmac until a few miles out of downtown.
He didn't know why he'd come back. Perhaps he had some small, insane hope of seeing Steph again; deep in his mind, he knew that Steph was gone, but at that time he'd never admit to that. He had scavenged the suburbs for a week, seeing a few other people doing the same in that time, but he had spent most of that time alone.
The fire had burned down while he was lost in thinking. Now, a small pile of smouldering ashes sits in front of him; an occasional flicker of flame spurts up when an unburned piece of wood gathers enough heat to ignite. It's time to move on, find a warmer place to get some sleep, perhaps some food.
Sand. All that was left of his home.
]]>Of course, he had been working towards this night. He had been through many a simulation preparing, each one a virtual replica of the experiences and emotions that would pass through his mind; each one rendered by a network of computers, specifically ordered by the Organisation to reproduce the feel of the moment down to the smallest detail.
Whenever he would step into the immersion tank to enter the simulation, and don the mask which would feed his senses with virtual information, another world would flood into his brain: sights, sounds, touches all provided by the tank, on behalf of the simulation network. And it was a perfect world in many ways.
The chase would always be reproduced perfectly. The pretty young thing he would find in a deep level of a car park perhaps, or maybe in a back alley, near-deserted in the small hours of the night. Her eyes would widen in fear, the black pupils expanding as her flight response took hold, and then she would run. And he would follow, purposefully and without excess haste, for he always knew he would catch his prey before long.
The simulated victim would twist and turn through the streets of a deserted city, seeking a way to escape her inevitably approaching predator, but always he would be there: just behind, waiting for the slip to happen. And it would happen. She would stumble and fall, maybe, or take a bad turning and face a blank wall, and then the prey would be trapped; the implacable hunter on one side, the immovable stone on the other.
Tonight, he had been judged by the lead committee of the Organisation to be ready for a live chase. He had been transposed to a particularly run-down inner suburb of the city, its glory days long since blown into the winds by time and changing fashion. There he had seen his prey: a shapely young one, perhaps nineteen, with auburn hair spilling down towards her waist. She had sensed him somehow, turning back and showing those same wide eyes, dark pupils bordered by hazel; and then she had run.
The chase had been especially satisfying, lasting just long enough for him to be aroused, yet not dragging on until he would lose the urgent need to catch his prey. It had also been convoluted, mapping out almost every alley and street of that part of the city; he was sure that this one knew the area well, and was confident of shaking him off. At least to begin with.
But he knew that the chase would end in his favour, and towards the end she seemed to sense it too, her energy flagging, reserves failing as her body finally gave in to the panic rushing through her. And as he grabbed her by that length of auburn, a surge unlike any that had happened through the simulations passed through him. He knew that this was real.
The Organisation had trained him not to be one of those animals dispensing rape and murder on the world; the committee felt there were enough of those rabid dogs without their contribution. Instead, he was one of the artists: his speciality was the infliction of pain, delicate ribbons of flesh being cut slowly from his prey with surgical precision, as she writhed beneath.
At first, he had been too quick to dispense the pain, his victim lapsing into unconsciousness almost immediately. But through the immersion tank, he had learned where to provide pressure, when to cut, to keep her hovering just conscious, but still able to feel every ounce of the hurt he was working to inflict. And now his initiation had come, his arts being worked on a live subject for the first time, and he knew that this performance would be judged favourably by the committee.
But incongruously, there was always one thing the simulations lacked, and that was the flow of blood. The technicians said it was a simple problem of physics; the processing power to calculate the flow of liquids on such a precise scale simply wasn't there. How that could be, when the system gave such accurate impressions of the fear in his prey's eyes, he'd never be able to understand, but that was the simple fact of the matter.
And so it was that he was surprised by the sheer amount of blood released by his art tonight. It covered him and the stone floor of the final alley of the chase; it flowed freely from the limp body before him, which all the time was discharging more out of itself. The very air was tinged with salt, and he could taste metal in his breath if he opened his mouth. He was glad he had taken notice of his trainer's final remark before this night.
Take a change of clothing.
]]>We all know that a tree, like the one seen in Windows' File Explorer, is nothing more than a nested list. But is it possible to code a tree up in HTML/CSS as a nested list?
Let's start off with the list. This is a snippet of a standard file tree, organised in "folders".
<ul>
  <li>
    Graphics
    <ul>
      <li><a href='gpu.h'>gpu.h: Function prototypes</a></li>
      <li><a href='gpu.c'>gpu.c: Graphic output implementation</a></li>
      <li>
        Debug Output
        <ul>
          <li><a href='dbgout.h'>dbgout.h: Output prototypes</a></li>
          <li><a href='dbgout.c'>dbgout.c: Output fixed-width font drawing</a></li>
          <li><a href='font5x7.h'>font5x7.h: Fixed-width font definitions</a></li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
And this is what we get from that code.
Having a tree means that each branching node can expand or collapse, to show or hide the elements of the tree within it. The showing and the hiding isn't so difficult; the display property in CSS allows us to do this pretty quickly, if we define two classes:
ul.hide { display: none; }
ul.show { display: block; }
Of course, it's not quite that simple. You can't change the state of a UL very easily (clicking on it won't do); but you can change the state of an LI. So we move the two classes to the enclosing LI:
li.hide ul { display: none; }
li.show ul { display: block; }
So, we have the CSS for hiding the tree. But how do we switch states? How can we show and hide the nodes at will? That's where the DOM comes in. If we put the description of the tree item ("Debug Output", for example) in an active element (DIV or A maybe), we can attach DOM events to it.
I've decided not to use A, because an anchor requires a href, and using a link of # will clutter up your browser's History facility. So, let's use a DIV.
What do we want to happen when we click the DIV? Basically, just flip the state of the parent LI, such that the ULs underneath switch between hidden and visible.
<ul>
  <li class='hide'>
    <div onclick='toggle(this.parentNode)'>Graphics</div>
    <ul>
      <li><a href='gpu.h'>gpu.h: Function prototypes</a></li>
      <li><a href='gpu.c'>gpu.c: Graphic output implementation</a></li>
      <li class='hide'>
        <div onclick='toggle(this.parentNode)'>Debug Output</div>
        <ul>
          <li><a href='dbgout.h'>dbgout.h: Output prototypes</a></li>
          <li><a href='dbgout.c'>dbgout.c: Output fixed-width font drawing</a></li>
          <li><a href='font5x7.h'>font5x7.h: Fixed-width font definitions</a></li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
So when you click the "Graphics" DIV, toggle() runs and flips the top LI from hide to show. And of course, if you click it again, it flips back to hide.
We'll need some JavaScript to do this; fortunately, JS gives us the ternary
operator, where we can select between two options based on a condition.
function toggle(x)
{
    x.className = (x.className=='show') ? 'hide' : 'show';
}
What this means is: "If the className is show, set it to hide, otherwise (i.e. if it's not show) set it to show". Since there are only two possibilities for the class name, you can see that this toggles between the two.
Just before we get to an actual working example, you should remember that you'll have to define CSS for each level of menu that we go down: a selector like li.show ul matches every UL beneath that LI, not just the one directly inside it, so an expanded parent would otherwise override a collapsed child further down.
So now, we can put it all together, and come up with a simple tree that is collapsible/expandable with a bit of DOM fiddling.
Here, I've just added some styling to the text DIV, which can change along with the parent LI state just as the UL does. Again, each level needs its own rule, so just put in an extra line for each level down.
Now we have just one problem. As you can see, the page loads with the Graphics item collapsed. What if you don't have JS running? Click on the DIV and nothing happens; you can't get to the list underneath! Obviously a problem. The way to alleviate this is to have everything expanded by default instead of collapsed; if you need to, use an onload tree collapse so that the tree will collapse if you run JS, and stay expanded if you don't.
function treeCollapse()
{
    var list = document.getElementById('yourtree').getElementsByTagName('li');
    for(var i=0;i<list.length;i++)
        list[i].className = 'hide';
}
This'll just get a list of all the LIs in the tree, and set them to class hide.
So, that's how to make a nested list into an expandable tree.
]]>For this quick run-through, I'll be assuming you know what the binary numbering system is, and how it works; furthermore, I'll assume a little familiarity with working in binary. Everyone who uses a computer, whether it be to connect to the Internet, for video games, or simply to edit a Word document, will have experienced the binary numbering system, as this is how all computers function internally. However, to make binary manipulations, you will need a slightly more advanced idea of how this system operates. I'll also be presenting any code examples using the syntax of C and its syntactical derivatives (C++, PHP, Java and the like). Don't worry if you don't know the C syntax for the operators; I'll be putting a small table at the end of the document.
The first operation to look at is called AND. It's called that because that's exactly what it does: take two inputs, and only return any output if both input 1 AND input 2 are on.
in1 | in2 | in1 AND in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
That table is an example of a truth table; it plots out all the possible combinations of inputs, and what the operator will do with them. As you can see, the AND operator only returns 1 if both inputs are 1. But how does that help us in the real world, of numbers bigger than one bit?
The major thing that can be done with AND is masking: only using the part of a number that you want to use. For example, let's say you have a 32-bit variable, and you're incrementing it in a loop. But you want to wrap the value after 255, back to 0. A simple case of using an if statement, you may think. But think again.
while(1)
{
    i++;

    // How we'd do this with an if clause
    // if(i > 255) i = 0;

    // How to do it with AND
    i = i & 255;
}
What's happening here? Let's have a look at a normal case first; say i is at a value of 77. The AND operation looks like this:
00000000 00000000 00000000 01001101
00000000 00000000 00000000 11111111 [AND]
-----------------------------------
00000000 00000000 00000000 01001101
The AND "mask" is essentially passing through the low 8 bits of i into the result. In this case, 77 fits fine into 8 bits, so no change. What happens at the borderline case: when 255 becomes 256?
00000000 00000000 00000001 00000000
00000000 00000000 00000000 11111111 [AND]
-----------------------------------
00000000 00000000 00000000 00000000
As stated above, the low 8 bits of i are being passed through. That's 0. The rest of the value, including the '256' bit, is essentially ignored by the mask, meaning the value automagically wraps from 255 to 0. Quite useful, you'll admit.
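If you'd like to convince yourself that the two approaches really do the same job, here's a small sketch that puts each version of the loop body into its own function. The function names are mine, made up for the comparison, and I'm assuming a fixed 32-bit unsigned counter.

```c
#include <stdint.h>

/* Advance the counter, wrapping back to 0 with an if statement. */
uint32_t next_with_if(uint32_t i)
{
    i++;
    if (i > 255) i = 0;
    return i;
}

/* Advance the counter, wrapping by keeping only the low 8 bits. */
uint32_t next_with_and(uint32_t i)
{
    i++;
    return i & 255;
}
```

The two agree for any counter that starts somewhere in the range 0 to 255, which is always the case in the loop above. They only part company if the counter is already out of range: from 300, the if version snaps back to 0, while the AND version gives whatever is in the low 8 bits of 301.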
Something else you can do with AND is to clear a certain portion of a value. For example, you want to check the network that an IP address lives on. An IP address is just another 32-bit number, and the subnet mask that accompanies it is exactly that: a mask, that's applied to the IP using the AND operator, to find the subnet for that IP address.
Let's take my setup at home. I have a few computers at home, and they all have addresses in a private IP range. One of those computers is 172.16.55.37, with a subnet mask of 255.255.255.224; how does the router know which network I'm coming from?
    IP address: 10101100.00010000.00110111.00100101
    Snet Mask:  11111111.11111111.11111111.11100000
    AND:        -----------------------------------
    Subnet:     10101100.00010000.00110111.00100000 [172.16.55.32]
So when the router sees a packet destined for 172.16.55.37, it applies this AND operation, works out that the network is in its route table, and forwards the packet. In other words, the Internet wouldn't work without the AND operator.
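The same calculation can be sketched in C. The helper names ipv4 and network_of are hypothetical, and the address is assumed to be held as a single 32-bit number with the first octet in the top byte; the left shifts used to build it are covered further on.

```c
#include <stdint.h>

/* Pack four dotted-quad octets into one 32-bit address (a is the top byte). */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Apply the subnet mask: host bits are cleared, network bits pass through. */
uint32_t network_of(uint32_t address, uint32_t netmask)
{
    return address & netmask;
}
```

With these, network_of(ipv4(172, 16, 55, 37), ipv4(255, 255, 255, 224)) gives the same value as ipv4(172, 16, 55, 32).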
A final example for AND: When you want to make a Windows application, you generally want to display a window, and windows can have various styles associated with them. The most commonly used style is WS_OVERLAPPEDWINDOW, which is just a number given an easier-to-read name: it combines various styles which tell Windows to provide a caption, system menu, minimise and maximise buttons.
But what if you don't want a minimise button? You can build the style you need by taking the normal one, and cutting out the value for WS_MINIMIZEBOX. The style values were set with just this idea in mind: simple styles are binary power values, and complicated styles combine them together.
So let's give that a go: a window with no minimise button.
    OVERLAPPEDWINDOW: 00000000 11001111 00000000 00000000
    MINIMIZEBOX:     (00000000 00000010 00000000 00000000)
    Clear mask:       11111111 11111101 11111111 11111111
    AND:              -----------------------------------
    Overall style:    00000000 11001101 00000000 00000000
If the new value is passed to Windows, you'll get a shiny new window with no minimise button. To generate the clear mask, you can use the NOT operator, which we'll be coming to later.
The second operator we'll look into is called OR. And as you can guess, it's called OR for a reason: it takes two inputs, and gives an output if either one or the other, or both, is set. Here's another of those truth tables.
in1 | in2 | in1 OR in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
Note how it doesn't matter whether in1 is on or off; if in2 is on, the result is on. And, of course, the inverse applies. So, how can that be used in the real world?
The major application of OR is in setting bits; filling a portion of a value with 1's. Let's take the Windows styles from above: you may want not just a normal window, but a window which can handle vertical scrolling of its contents. Luckily, it's simple to define such a window: just tack the WS_OVERLAPPEDWINDOW default style and WS_VSCROLL together, thus.
    OVERLAPPEDWINDOW: 00000000 11001111 00000000 00000000
    VSCROLL:          00000000 00100000 00000000 00000000
    OR together:      -----------------------------------
    New window style: 00000000 11101111 00000000 00000000
By using AND and OR in this manner, it's quite easy to build up exactly the style of window you're looking for.
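As a rough sketch, the two style manipulations look like this in C. The STYLE_ constants are local stand-ins carrying the same bit patterns as the diagrams, so that the snippet builds without the Windows headers; without_flag and with_flag are illustrative names, not part of any API.

```c
#include <stdint.h>

/* Bit patterns copied from the diagrams; local names, not the real headers. */
#define STYLE_OVERLAPPEDWINDOW 0x00CF0000u
#define STYLE_MINIMIZEBOX      0x00020000u
#define STYLE_VSCROLL          0x00200000u

/* AND with the inverted flag clears just that flag. */
uint32_t without_flag(uint32_t style, uint32_t flag)
{
    return style & ~flag;
}

/* OR sets the flag and leaves every other bit as it was. */
uint32_t with_flag(uint32_t style, uint32_t flag)
{
    return style | flag;
}
```

Removing the minimise box from the default style yields 0x00CD0000, and adding the vertical scrollbar to the default style yields 0x00EF0000, matching the two worked diagrams.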
OR is the operation that allows you to set a bit if either or both of two inputs is set. But what if you only want to check either, and not both? There is an operator for that, and it's called the Exclusive OR, XOR for short. Another truth table for you:
in1 | in2 | in1 XOR in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
Now, it might seem a bit esoteric and theoretical, to have an operator that only sets output if either input is set, and not both. However, the XOR operation does come in useful.
If you have a variable or a CPU register, and you want to clear it to 0, you don't care about what's inside; those contents will be obliterated anyway. Of course, you can simply move "0" into that variable, but in some cases that might be a problem. Instead, you can use XOR.
    Random value: 11001001 01001110 11010010 11110001
    Value again:  11001001 01001110 11010010 11110001
    XOR:          -----------------------------------
    Output:       00000000 00000000 00000000 00000000
For every one of the 32 bits in that example, either the top or the bottom line of the truth table matched, which means the output was 0 in both cases. The end result is, of course, that all the bits of the value turn out as 0 after the XOR operation. Now, why would one want to perform such an operation? Take a look at this example, in Intel assembly.
    mov eax, 0      ; Assembles to 5 bytes
    xor eax, eax    ; Assembles to 2 bytes

If you're at a premium for space, it's obvious which of the two you'd pick; instead of wasting 3 bytes of space, a bitwise operator can do the same job.
The other major application of XOR is to flip a bit, or range of bits, within a value. If you take a look at the truth table in two halves, the top half is essentially the bottom half upside-down; and that flipping is controlled by the value of in1. This comes in very useful for certain situations.
For example, let's say you've been tasked with producing a square wave: a constant series of pulses, 64 then 191, then 64, then 191. You could implement a complicated series of if statements, or you could use a simple XOR.
    unsigned char output = 64;
    while(1)
    {
        output = output ^ 255;
        usleep(500);    // sleep for half the wave period
    }
So, what's happening here? The output value starts at 64, and gets changed by the XOR every time the loop runs. How does that change manifest itself?
    output: 01000000
    Mask:   11111111
    XOR:    --------
    Result: 10111111
So we started at 64, and the XOR has flipped all the bits, to 191. After 500 microseconds, the loop is re-entered, and the XOR gets applied again:
    output: 10111111
    Mask:   11111111
    XOR:    --------
    Result: 01000000
And as if by magic, the XOR returns the result we entered with the first time: 191 gets flipped to 64. Because the value is 64 for 500 microseconds, and 191 for another 500 microseconds, the overall wave period is 1 millisecond, and we've produced the 1kHz square wave, with just one XOR operation.
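The flip generalises to any mask: only the bits set in the mask are inverted, and applying the same mask twice gets you back where you started. A minimal sketch, with flip_bits as a made-up name:

```c
#include <stdint.h>

/* Invert exactly those bits of value that are set in mask. */
uint8_t flip_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value ^ mask);
}
```

flip_bits(64, 255) gives 191, and flipping 191 with the same mask gives 64 again, which is the square wave from the loop above. XORing a value with itself, flip_bits(x, x), always gives 0, which is the register-clearing trick.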
If all you want to do is flip the whole of a value, as in the above example, there is an alternative. It's called NOT, and it's different to the rest of the bitwise operators we've covered so far. Instead of taking two inputs, it just takes one; the output is the opposite of the input.
x | NOT x |
---|---|
0 | 1 |
1 | 0 |
NOT is one of those more esoteric operations; it doesn't see much use. One place where it can be used to good effect is to generate masks for use in the AND operation. Let's say you want to clear the bottom two bits of a value, while keeping all the rest. Instead of hassling around with long strings of binary or hexadecimal digits, we can let bitwise operators do the grunt work for us.
value = value & (~3);
Alright; short, succinct, but what's it doing? Let's take a look at the binary level.
    Three:  00000000 00000000 00000000 00000011
    NOT 3:  11111111 11111111 11111111 11111100
    Value:  11001001 00011110 10001101 00111011
    AND:    -----------------------------------
    Result: 11001001 00011110 10001101 00111000
In this way, you can easily generate the masks you need for your AND operations, without having to dig deep into hexadecimal strings and remembering any combination tables; NOT's doing the hard work for you.
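To take the idea one step further, the mask itself can be built from a bit count. The function below is an illustrative sketch (clear_low_bits is my own name for it), and it assumes count is between 0 and 31.

```c
#include <stdint.h>

/* Clear the lowest `count` bits of value; count must be in the range 0..31. */
uint32_t clear_low_bits(uint32_t value, unsigned count)
{
    uint32_t low = (1u << count) - 1u;   /* `count` ones at the bottom, e.g. 3 when count is 2 */
    return value & ~low;                 /* NOT turns that into the clear mask */
}
```

With a count of 2 this is exactly value & (~3); a handy side effect is that the result is rounded down to a multiple of 4.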
So far, the bitwise operators have been 1-bit affairs: take one bit from an input (or two), give one bit of output. There are operators, however, that take a whole string of binary digits, and do simple operations with that. The major example of those operators is the shift.
Shifting comes in two variants: left and right. As you can probably guess, a left shift takes a binary string of values and shifts it left, pushing it up the binary powers. Let's take a look with an example.
    Before: 00001111
    Left 1: 00011110
    Left 2: 00111100
    Left 3: 01111000
The 1's get pushed up from the bottom of the value, towards the top; the gaps that are left are filled in by 0's. If you look carefully, you'll note that this is essentially equivalent to multiplying the value by powers of two: before shifting, we had 15. Shift left by 1 and we end up with 30; left by 2 gives us 60, and left by 3 gives 120.
However, what happens if we start with a 1 near the top, and then start shifting left? Let's take a look.
    Before: 01100101
    Left 1: 11001010
    Left 2: 10010100
    Left 3: 00101000
The value got shifted along sure enough, and 0's were inserted at the bottom, but where did the upper 1's go? The answer is, they vanished; basically, they fell off the end of the value as a result of the shifting operation. If you decide to use shifts, keep in mind that this will happen if you run out of space for the bits.
Also note how the most-significant bit changes during the successive shift operations. If you're dealing with signed values, the most significant bit is often used to denote the sign (1 meaning that this is a negative number); in this case, your number is going from positive to negative, then back to positive! That is one disadvantage of the shift operation that has to be kept in mind when you use it.
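Both effects can be seen with an 8-bit value in C. This is a sketch with invented helper names; the cast back to uint8_t is what throws away the bits that fall off the top, since C performs the shift itself at the width of int.

```c
#include <stdint.h>

/* Shift an 8-bit value left by n (0..7); bits pushed past bit 7 are lost. */
uint8_t shl8(uint8_t value, unsigned n)
{
    return (uint8_t)(value << n);
}

/* Non-zero when the top bit is set, i.e. when the same bit pattern
 * would read as negative if treated as a signed byte. */
int top_bit_set(uint8_t value)
{
    return (value & 0x80u) != 0;
}
```

Starting from 0x65 (01100101), the three shifts give 0xCA, 0x94 and 0x28; the top bit is set after the first two and clear again after the third.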
The right shift is very similar to the left shift, only it acts in the reverse direction: bits get moved to the right, and 0's are inserted from the left hand side.
    Before:  00111000
    Right 1: 00011100
    Right 2: 00001110
    Right 3: 00000111
Just as left shift can be thought of as multiplication by a binary power, right shift can be used to divide by a binary power; in the case above, 56 becomes 28, then 14, then 7, simply by successive shifts to the right.
Just as with the left shift also, the right shift provides no safeguards if 1's are at the low end of the scale. In the example above, another right shift would yield a value of 3, since the lowest 1 simply drops off the end of the value.
Also just like the left shift, there's no safeguard on the highest bit; if it started out as 1 (which in a signed value denotes negative), a right shift would fill in 0's, making the value positive. For that reason, processors and programming languages often offer an "arithmetic" right shift operation, which fills with 0 if the top bit was 0, and fills with 1 if the top bit was 1; by using the arithmetic right shift, the sign of the value is preserved.
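In C, right-shifting an unsigned value always fills with 0's, but what happens to a negative signed value is left up to the compiler. The sketch below builds the arithmetic shift by hand so the behaviour is spelled out; the function names are illustrative, n is assumed to be 0..31, and the final conversion back to a signed type assumes the usual two's complement representation.

```c
#include <stdint.h>

/* Logical right shift: zeros always come in from the left. */
uint32_t shr_logical(uint32_t value, unsigned n)
{
    return value >> n;
}

/* Arithmetic right shift: copies of the sign bit come in from the left. */
int32_t shr_arithmetic(int32_t value, unsigned n)
{
    uint32_t bits = (uint32_t)value >> n;
    if (value < 0 && n > 0)
        bits |= ~(UINT32_MAX >> n);   /* refill the vacated top bits with 1s */
    return (int32_t)bits;             /* assumes two's complement conversion */
}
```

A logical shift of 0x80000000 by 1 gives 0x40000000 and loses the sign, whereas shr_arithmetic(-8, 1) gives -4.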
That's all well and good, throwing values around inside a binary string. But to what end can it be put? How can shifts be used in the real world?
Let's say you have in your possession four 8-bit values, and you wish to build a 32-bit value by tacking them all together. The left shift makes this a very easy task to accomplish.
    Values: 11001010, 00111010, 01001101, 00110011

    Pushing all values into 32-bit variables:
    var4:  00000000 00000000 00000000 00110011
    var3:  00000000 00000000 01001101 00000000 <-- left shift 8
    var2:  00000000 00111010 00000000 00000000 <-- left shift 16
    var1:  11001010 00000000 00000000 00000000 <-- left shift 24
    OR:    -----------------------------------
    Final: 11001010 00111010 01001101 00110011

    value = var4 | (var3 << 8) | (var2 << 16) | (var1 << 24);
Similarly, if you wanted to split that 32-bit value up into 8-bit chunks, you could simply perform a successive right-shift by 8, in conjunction with an AND operation to mask off the part of the result you're looking for.
    Initial: 11001010 00111010 01001101 00110011
    Mask:    00000000 00000000 00000000 11111111
    AND:     -----------------------------------
    var4:                               00110011

    RShift:  00000000 11001010 00111010 01001101
    Mask:    00000000 00000000 00000000 11111111
    AND:     -----------------------------------
    var3:                               01001101

    RShift:  00000000 00000000 11001010 00111010
    Mask:    00000000 00000000 00000000 11111111
    AND:     -----------------------------------
    var2:                               00111010

    RShift:  00000000 00000000 00000000 11001010
    Mask:    00000000 00000000 00000000 11111111
    AND:     -----------------------------------
    var1:                               11001010

    var4 = value & 255;
    var3 = (value >> 8) & 255;
    var2 = (value >> 16) & 255;
    var1 = (value >> 24) & 255;
A little bit long-winded, at least when written down in raw binary. Keep in mind, though, that the computer can do this at a stupidly high pace, especially when you use operators designed to work with bits.
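Wrapped up as a pair of C functions, the packing and unpacking might look like the following sketch. The function names are invented; the casts to uint32_t are there because an 8-bit value shifted left by 24 would otherwise be computed as a signed int, which can overflow.

```c
#include <stdint.h>

/* Build a 32-bit value from four bytes; var1 ends up in the top byte. */
uint32_t pack_bytes(uint8_t var1, uint8_t var2, uint8_t var3, uint8_t var4)
{
    return (uint32_t)var4 | ((uint32_t)var3 << 8) |
           ((uint32_t)var2 << 16) | ((uint32_t)var1 << 24);
}

/* Split a 32-bit value into four bytes; out[0] receives the top byte (var1). */
void unpack_bytes(uint32_t value, uint8_t out[4])
{
    out[3] = (uint8_t)(value & 255);
    out[2] = (uint8_t)((value >> 8) & 255);
    out[1] = (uint8_t)((value >> 16) & 255);
    out[0] = (uint8_t)((value >> 24) & 255);
}
```

Packing 0xCA, 0x3A, 0x4D and 0x33 (the four values from the diagram) gives 0xCA3A4D33, and unpacking that value returns the same four bytes.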
Another use of shifts is to multiply values by multipliers that aren't direct binary powers. This isn't such a common operation any more, since multiplier units are quite quick nowadays, but if you ever come across a weird sequence of shift operations in some old code, you'll know what it's doing, and why it's there.
In the old VGA days, the most common graphic mode used on PCs had 320 pixels to a line. Pixels were written to the screen in a flat framebuffer: a portion of memory which translated directly to the screen, 320 bytes to a line. That meant finding out a location in memory, given an X and Y coordinate to draw to. The formula is pretty simple, but the multiplier units of the CPUs back then were quite slow, so every advantage was sought out.
    Memory Location = (y * 320) + x
                    = (y * 256) + (y * 64) + x
                    = (y << 8) + (y << 6) + x
Since shifts were orders of magnitude faster than multiplies, this series of two shifts and two adds was done much more quickly than one multiply and one add. Of course, you probably won't see this so much any more, since CPUs actually got good at multiplying, but it's possible you'll come across this kind of technique in another place.
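For reference, here's the shifted form as a C function; vga_offset is a hypothetical name, and the coordinates are assumed to be unsigned so the shifts are well-defined.

```c
#include <stdint.h>

/* Offset of pixel (x, y) in a flat framebuffer with 320 bytes per line,
 * using 320 = 256 + 64 so that only shifts and adds are needed. */
uint32_t vga_offset(uint32_t x, uint32_t y)
{
    return (y << 8) + (y << 6) + x;
}
```

The result is the same as (y * 320) + x for any coordinate on the screen; a modern compiler will typically pick whichever form is fastest on its own, so the plain multiply is the more readable choice today.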
So, that's all the bitwise operators you'll come across in your travels through programming. Some of them may come in very useful, and some may be used less often, but have no doubt: they'll all be used sometime.
A final note: This table shows how C-syntax languages denote the bitwise operators.
Operator | Symbol | Syntax |
---|---|---|
AND | Ampersand | x & y |
OR | Vertical pipe | x \| y |
XOR | Caret | x ^ y |
NOT | Tilde | ~ x |
Left shift | Two less-than signs | x << y |
Right shift | Two greater-than signs | x >> y |
Placed into the public domain by Imran Nazar, 2006.
The problem is that, because programs are just files like any other files, they can be changed: certain numbers modified, or new streams added in the middle of the file. This is one way in which viruses are able to infect a file; they may add to a program file, and the next time that program is run, the virus is executed and performs the functions it was designed to do.
One of the best ways to alleviate this is to use a technique known as the 'message digest': a number which describes the entire contents of a file. A simple example would be an algorithm like the following:
If each letter of the alphabet is assigned a number, A being 1, B being 2 and so forth, it's possible to add all the letters in a text file together, to obtain a "check-sum"; a sum of the contents of the file, which can be used to check it. As an example, HELLO WORLD would add up to 124 (52 for HELLO, plus 72 for WORLD).
One of the problems with this simplest example of message digest algorithm is that the space could be removed entirely, and the checksum would be the same; also, the message could be changed and more content added, as long as the total of all the letters was still 124. As a result, more complex algorithms have been developed to take these issues into account.
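The letter-sum scheme is easy to sketch in C. The function name letter_sum is my own, and the code assumes an ASCII-style character set in which the letters A to Z are contiguous; anything that isn't a letter is ignored, just as in the description.

```c
#include <ctype.h>

/* Add up the letters of text, counting A (or a) as 1, B as 2, and so on.
 * Spaces, digits and punctuation contribute nothing to the sum. */
unsigned letter_sum(const char *text)
{
    unsigned sum = 0;
    for (; *text != '\0'; ++text) {
        unsigned char c = (unsigned char)*text;
        if (isalpha(c))
            sum += (unsigned)(toupper(c) - 'A' + 1);
    }
    return sum;
}
```

Both weaknesses show up straight away: letter_sum("HELLO WORLD") and letter_sum("HELLOWORLD") return the same number, and so does any rearrangement of the letters, such as "HELLO WORDL".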
When these methods are used on a program file, they generate a number which is, for practical purposes, unique to the combination of numbers within that file. If anything changes, like an instruction being changed or instructions being added by a virus, the digest number will almost certainly change.
This can be used to detect viruses every time a program is run. The process is relatively simple: when the program is first installed, a digest is generated based on that program file. Every time the program is used, the digest is again calculated, and if the numbers differ, some outside agent has changed the file.
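The install-then-verify process can be sketched as follows. To be clear about the assumptions: the digest here is the 32-bit FNV-1a hash, chosen only because it fits in a few lines; it is a stand-in, not what any real protection system uses, and it isn't strong enough to resist deliberate tampering. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in digest: 32-bit FNV-1a over the bytes of the file. */
uint32_t digest(const unsigned char *data, size_t length)
{
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < length; ++i) {
        hash ^= data[i];              /* mix in the next byte */
        hash *= 16777619u;            /* FNV prime */
    }
    return hash;
}

/* Returns non-zero if the contents still match the digest stored at install time. */
int is_unmodified(const unsigned char *data, size_t length, uint32_t stored_digest)
{
    return digest(data, length) == stored_digest;
}
```

At install time the program's bytes are passed through digest and the result stored somewhere safe; on each run, is_unmodified is called with the current bytes and the stored value, and a zero result means something has changed the file.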
One of the most popular implementations of this system is used inside Windows, known as File Protection; it's used on files which are important to the system, such as device drivers. If Windows detects that one of the files has a different digest to that which it knows about, the user will be alerted to the fact that a system file has been changed. Many other systems are also in place in other software packages to perform similar functions.
It started innocently enough. I was looking for a small ROM which would be used to test the framebuffer display mode of DSemu, a Nintendo DS emulator. LiraNuna agreed to put a small C demo together, to fill the 'main' screen with red, demonstrating the framebuffer's use. When compiled and spliced up, the .nds ended up at around 7.5KB.
That, LiraNuna thought, was a bit large for something that did so little as his demo evidently did. Stepping through with DSemu's debugger, I noticed a whole lot of code being run which wasn't strictly required: setting up cache parameters and the stack, clearing out regions of memory, and such like. Referred to as the crt0, this code is inserted into every project, to safeguard the execution environment.
Furthermore, there was the standard ARM7 code also inserted into the .nds file, which does such things as set up the touchscreen. All this, we thought, was a bit over-the-top for a demo that was literally doing almost nothing. So, the cut-down began.
First off, LiraNuna thought the functionality of the ARM7 wasn't particularly required for this demo. So, the thought process went, why not simply tell that 'sub' CPU to enter an infinite loop and not do anything? The reasoning was sound, and so the ARM7 source file was replaced with a simple assembly file, looking something like this.
main: b main
Once put together, that reduced the size of the overall .nds file by quite a way; down to approximately 5KB. However, I still thought that was a touch large. A quick peek into the .nds file showed why that was: the sub CPU, just like the main CPU, has a crt0 automatically inserted by the build process, and this made up the vast majority of the ARM7 portion of the .nds file.
Therefore, LiraNuna took the step of subverting a part of the build process, by deleting the result of the ARM7 compilation, and replacing it with a straight binary file, encoding the infinite-loop opcode.
0000 FE FF FF EA
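Those four bytes, read as a little-endian word, are the opcode 0xEAFFFFFE. To see why that's an infinite loop, here's a sketch in C of how an ARM branch opcode resolves to its destination; branch_target is an invented name, and the function assumes it's handed a B or BL opcode.

```c
#include <stdint.h>

/* Destination of an ARM B/BL instruction stored at `address`. */
uint32_t branch_target(uint32_t opcode, uint32_t address)
{
    uint32_t offset = opcode & 0x00FFFFFFu;   /* low 24 bits: signed offset in words */
    if (offset & 0x00800000u)
        offset |= 0xFF000000u;                /* sign-extend to 32 bits */
    return address + 8u + (offset << 2);      /* the PC reads two instructions ahead */
}
```

For 0xEAFFFFFE the offset is -2 words, or -8 bytes, which exactly cancels the pipeline's +8: the branch lands on its own address, wherever it happens to be loaded.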
That left the overall binary at around 4KB. Still plenty of room for improvement, I thought.
The main code was still in C, and compiled to Thumb binary. Stepping through that in DSemu's debugger, I noticed a few odd things introduced by the compiler, that seemed to do very little; values being left-shifted and then right-shifted again, to no overall effect, and similar oddities. So, the next logical step was to write that portion without the intervention of the compiler, in assembly.
LiraNuna put together a first attempt at an assembly version of the program, as follows.
    main:
        @ sets POWER_CR
        mov r0, #0x4000000
        orr r0, r0, #0x300
        orr r0, r0, #0x4
        mov r1, #0x3
        str r1, [r0]

        @ sets mode
        mov r0, #0x04000000
        mov r1, #0x00020000
        str r1, [r0]

        @ sets VRAM bank a
        mov r0, #0x04000000
        add r0, r0, #0x240
        mov r1, #0x80
        strb r1, [r0]

        @ loop
        mov r0, #0x06800000
        mov r1, #0x1F
        orr r1, r1, #0x8000
        mov r2, #0x18000

    filloop:
        strh r1, [r0], #0x1
        subs r2, r2, #0x1
        bne filloop

    lforever:
        b lforever
When compiled up, that definitely made a difference; the overall ROM size dropped to approximately 1.5KB. However, I started to have an inkling that we could do better. And that's when pepsiman piped up with a suggestion: place the code inside the .nds header.
What did pepsiman mean by that? In order to understand that, it's important to know what a .nds ROM looks like, on the inside.
File offset | Component |
---|---|
0000 | NDS ROM header (512 bytes) |
0200 | ARM9 binary |
0200+ARM9 | ARM7 binary |
0200+both | Optional file table |
The conventional layout dictates that the main CPU's binary be placed after the header, and the sub CPU's binary after that. However, that doesn't have to hold true all the time; the order can be swapped, blank space can be inserted between the binaries, or after them.
That's all well and good, but inside the header? In order to understand that, it's required to look inside that top chunk of the file: the ROM header.
    0x00   Game title
    0x0C   Game code                 ####
    0x10   Maker code
    0x12   Unit code                 0x00
    0x13   Device type               0x00
    0x14   Device capacity           0x00 (1 Mbit)
    0x15   (8 bytes blank space)
    0x1E   ROM version               0x00
    0x1F   reserved                  0x04
    0x20   ARM9 ROM offset           0x200
    0x24   ARM9 entry address        0x2000000
    0x28   ARM9 RAM address          0x2000000
    0x2C   ARM9 code size            0x3A0
    0x30   ARM7 ROM offset           0x600
    0x34   ARM7 entry address        0x3800000
    0x38   ARM7 RAM address          0x3800000
    0x3C   ARM7 code size            0x8
    0x40   File name table offset    0x608
    0x44   File name table size      0x9
    0x48   FAT offset                0x614
    0x4C   FAT size                  0x0
    0x50   ARM9 overlay offset       0x0
    0x54   ARM9 overlay size         0x0
    0x58   ARM7 overlay offset       0x0
    0x5C   ARM7 overlay size         0x0
    0x60   ROM control info 1        0x00586000
    0x64   ROM control info 2        0x001808F8
    0x68   Icon/title offset         0x0
    0x6C   Secure area CRC           0x0000 (-, homebrew)
    0x6E   ROM control info 3        0x0000
    0x70   (16 bytes blank space)
    0x80   Application end offset    0x00000000
    0x84   ROM header size           0x00000200
    0x88   (36 bytes blank space)
    0xAC   PassMe autoboot detect    0x53534150 ("PASS")
    0xB0   (16 bytes blank space)
    0xC0   Nintendo Logo (156 bytes)
    0x15C  Logo CRC                  0x9E1A (OK)
    0x15E  Header CRC                0xC9D3 (OK)
    0x160  (160 bytes blank space)
The entries marked as blank space indicate regions of empty space in the header structure. These are normally left behind during the construction of the format, to allow for expansion. In this case, however, it's possible to make use of the blank regions in the header for the purposes of holding code.
From looking at the above output, it's simple to see that the structure of the .nds file as a whole is dictated by the entries in this header. The fact that the ARM9 binary follows the header is simply due to the setting of "ARM9 ROM offset" to 0x200, which is the first byte in the file after the header. Similarly, the ARM7 code following the ARM9 is a simple effect of the "ARM7 ROM offset" being set to 0x600, which corresponds to an offset in the file of 1.5KB.
Simply by changing the "ROM offset" values in this header, it's possible to change the point from which the code for the CPUs is loaded, from the default location after the header to somewhere inside the header; overwrite the zeros in that position with ARM opcodes, and load from there. It seemed a good idea by pepsiman, and viable.
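Patching those fields by hand in a hex editor works, but the same edit can be sketched in C. The field offsets below are taken from the header listing above; the enum and function names are my own, and the values are stored least-significant byte first, as they appear in the file.

```c
#include <stdint.h>

/* Header field positions, as given in the header listing. */
enum {
    NDS_ARM9_ROM_OFFSET = 0x20,
    NDS_ARM9_CODE_SIZE  = 0x2C,
    NDS_ARM7_ROM_OFFSET = 0x30,
    NDS_ARM7_CODE_SIZE  = 0x3C
};

/* Read a 32-bit little-endian field from the header buffer. */
uint32_t header_read32(const uint8_t *header, unsigned field)
{
    return (uint32_t)header[field]
         | ((uint32_t)header[field + 1] << 8)
         | ((uint32_t)header[field + 2] << 16)
         | ((uint32_t)header[field + 3] << 24);
}

/* Write a 32-bit little-endian field into the header buffer. */
void header_write32(uint8_t *header, unsigned field, uint32_t value)
{
    header[field]     = (uint8_t)(value & 255);
    header[field + 1] = (uint8_t)((value >> 8) & 255);
    header[field + 2] = (uint8_t)((value >> 16) & 255);
    header[field + 3] = (uint8_t)((value >> 24) & 255);
}
```

Pointing a CPU's code at a different part of the file is then a matter of calling header_write32 on its "ROM offset" and "code size" fields with the new position and length.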
LiraNuna's ARM9 code seemed quite short, but I thought I could go one better, shrinking the code down further.
    main:
        mov r0,#0x04000000      ; I/O space offset
        mov r1,#0x3             ; Both screens on
        mov r2,#0x00020000      ; Framebuffer mode
        mov r3,#0x80            ; VRAM bank A enabled, LCD

        str r1,[r0, #0x304]     ; Set POWERCNT
        str r2,[r0]             ; DISPCNT
        str r3,[r0, #0x240]     ; VRAMCNT_A

        mov r0,#0x06800000      ; VRAM offset
        mov r1,#31              ; Writing red pixels
        mov r2,#0xC000          ; 48K of them (96KB)

    lp: strh r1,[r0],#2         ; Write a pixel
        subs r2,r2,#1           ; Move along one
        bne lp                  ; And loop back if not done

    nf: b nf                    ; Sit in an infinite loop to finish
Once assembled, this code ended up looking like the following.
    0000  01 03 A0 E3 03 10 A0 E3 02 28 A0 E3 80 30 A0 E3
    0010  04 13 80 E5 00 20 80 E5 40 32 80 E5 1A 05 A0 E3
    0020  1F 10 A0 E3 03 29 A0 E3 B2 10 C0 E0 01 20 52 E2
    0030  FC FF FF 1A FE FF FF EA
Definitely a little smaller; now the matter remained of where to put it, along with the ARM7 binary of one opcode (EAFFFFFE). The ARM7 was simple enough: the first region of blank space, 8 bytes, was ample space to place this opcode. The ARM7 offset was changed, the size changed to 4, and that part was done.
The ARM9 code was similarly simple to place in: the 160 bytes of free space at the end of the header seemed more than enough to stash the binary, and all that remained was to modify the ARM9 ROM offset and size.
And that, it seemed, was that. All the code fit comfortably into the header, and the final .nds was just 512 bytes in size. Surely that was all that could be done? Not quite.
As it turns out, not all 512 bytes of the header are used. The 160 bytes on the end are in the header simply by convention; one might as well say that the .nds file consists of a 352-byte header, 160 bytes of padding, and then the two CPU binaries. Was it possible to fit the 56-byte ARM9 binary somewhere else inside the header, and eliminate this padding?
I started by changing the "header size" field at 0x84 to reflect the new size of the header, which would be 0x160 bytes. Then, I started inserting the opcodes, until I had something like this.
    0070  01 03 A0 E3 03 10 A0 E3 02 28 A0 E3 80 30 A0 E3
    0080  00 00 00 00 A0 01 00 00 04 13 80 E5 00 20 80 E5
    0090  40 32 80 E5 1A 05 A0 E3 1F 10 A0 E3 03 29 A0 E3
    00A0  B2 10 C0 E0 01 20 52 E2 FC FF FF 1A 50 41 53 53
    00B0  FE FF FF EA 00 00 00 00 00 00 00 00 00 00 00 00
The fields in the header at 0x80, 0x84 and 0xAC can be seen, nestled within the ARM9 code. Now, this is quite a problem; if those values correspond to valid opcodes, they may be executed, and that might prove disastrous for the state of the program.
A disassembly was called for. I loaded up the new binary in DSemu, and the debugger gave the following output:
        mov r0,#0x04000000
        mov r1,#0x3
        mov r2,#0x00020000
        mov r3,#0x80
        andeq r0, r0, r0
        andeq r0, r0, r0, lsr #3
        str r1,[r0, #0x304]
        str r2,[r0]
        str r3,[r0, #0x240]
        mov r0,#0x06800000
        mov r1,#31
        mov r2,#0xC000
    lp: strh r1,[r0],#2
        subs r2,r2,#1
        bne lp
        cmppls r3, #0x14
    nf: b nf
It seems I was fortunate. The first two AND statements will never be executed, since they depend on the ZERO flag being set, and said flag is not set by the instructions above. As for the CMP, it slots into place after the VRAM-writing loop, which is indeed fortunate; if the CMP had fallen before the BNE, the loop may have executed forever, eventually running out of VRAM to write to.
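The "eq" and "pl" suffixes in that disassembly come from the top four bits of each opcode, which hold the condition under which the instruction runs. A small C sketch (arm_condition is an invented name) that pulls the condition out of any ARM-state opcode:

```c
#include <stdint.h>

/* Return the condition mnemonic held in bits 31..28 of an ARM opcode. */
const char *arm_condition(uint32_t opcode)
{
    static const char *const names[16] = {
        "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"
    };
    return names[opcode >> 28];
}
```

The two header fields 0x00000000 and 0x000001A0 both start with 0, giving "eq"; the "PASS" marker 0x53534150 starts with 5, giving "pl"; and the real instructions such as 0xEAFFFFFE start with E, "al", meaning always.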
Surprisingly fortunate, I thought; I hadn't planned for such a consequence, and it had simply come about due to the size and structure of the code. Either way, I wasn't about to complain.
So, there we have it. The smallest .nds file you're ever likely to see, which still does something. The ARM7 sticks itself into an infinite loop, and the ARM9 fills the main-core framebuffer with red before entering its own infinite loop. I eventually got my wish, of a small framebuffer-testing demo, but it was fun to get there.
    0000  4E 44 53 2E 54 69 6E 79 46 42 00 00 23 23 23 23  NDS.TinyFB..####
    0010  00 00 00 00 00 00 FE FF FF EA 00 00 00 00 00 04  ................
    0020  70 00 00 00 00 00 00 02 00 00 00 02 44 00 00 00  p...........D...
    0030  16 00 00 00 00 00 80 03 00 00 80 03 04 00 00 00  ................
    0040  A0 01 00 00 00 00 00 00 A0 01 00 00 00 00 00 00  ................
    0050  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0060  00 60 58 00 F8 08 18 00 00 00 00 00 00 00 00 00  .`X.............
    0070  01 03 A0 E3 03 10 A0 E3 02 28 A0 E3 80 30 A0 E3  .........(...0..
    0080  00 00 00 00 A0 01 00 00 04 13 80 E5 00 20 80 E5  ............. ..
    0090  40 32 80 E5 1A 05 A0 E3 1F 10 A0 E3 03 29 A0 E3  @2...........)..
    00A0  B2 10 C0 E0 01 20 52 E2 FC FF FF 1A 50 41 53 53  ..... R.....PASS
    00B0  FE FF FF EA 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00C0  C8 60 4F E2 01 70 8F E2 17 FF 2F E1 12 4F 11 48  .`O..p..../..O.H
    00D0  12 4C 20 60 64 60 7C 62 30 1C 39 1C 10 4A 00 F0  .L `d`|b0.9..J..
    00E0  14 F8 30 6A 80 19 B1 6A F2 6A 00 F0 0B F8 30 6B  ..0j...j.j....0k
    00F0  80 19 B1 6B F2 6B 00 F0 08 F8 70 6A 77 6B 07 4C  ...k.k....pjwk.L
    0100  60 60 38 47 07 4B D2 18 9A 43 07 4B 92 08 D2 18  ``8G.K...C.K....
    0110  0C DF F7 46 04 F0 1F E5 00 FE 7F 02 F0 FF 7F 02  ...F............
    0120  F0 01 00 00 FF 01 00 00 00 00 00 04 00 00 00 00  ................
    0130  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0140  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0150  00 00 00 00 00 00 00 00 00 00 00 00 1A 9E 7B EB  ..............{.
http://imrannazar.com/content/files/TinyFB.nds
Two9A, with thanks to LiraNuna and pepsiman