You have probably already derived the correct equations, but let's go through it anyway.
You want a rigid transformation that translates (x1, y1) to (x3, y3), and (x2, y1) to (x4, y4).
We can think of the rigid transformation as a translation, followed by a rotation. We'll start with a translation that moves (x1, y1) to (x3, y3). This is easy:
(a, b) ---> (a + x3 - x1, b + y3 - y1)
will do the trick.
Now, we want a transformation that rotates around the point (x3, y3) so that the point originally at (x2, y1), which has been mapped over to (x2 + x3 - x1, y3), rotates over to (x4, y4).
Let's work out the angle of this rotation. Recall that the direction of a vector (x, y) can be determined by the atan2(y, x) function (most math libraries have a atan2 which determines the quadrant of the vector as well as its arctan - don't use the simple atan( y/x ) function for this).
So the total rotation is just the difference between the two angular directions, that is
theta = atan2( y4 - y3, x4 - x3 ) - atan2( x2 - x1, 0 )
(remember that we need vectors that start at the point (x3, y3), so this needs to be subtracted from both of the points).
Okay, now, we simply apply a rotation operation.
(c, d) -----> ( c * cos(theta) - d * sin(theta), c * sin(theta) + d * cos(theta) )
where (c, d) = (a + x3 - x1, b + y3 - y1), from before.
If you are finding that you might be off by a few pixels near the edge of your image, you're suffering from a resolution problem: the true locations of your two points aren't really (x3, y3) and (x4, y4), but those are just the closest co-ordinates. This does not affect the translation, since the same (sub-pixel) error just gets propagated throughout the image. But for the rotation, a small error in the rotation near (x4, y4) could turn into a larger error farther away. Basically, the error will be something around the resolution of your image divided by abs(x2 - x1). If x2 and x1 are close together, you can see why you might be off by a few pixels.
There is really no good way to correct for this, other than getting more accurate co-ordinates for (x3, y3) and (x4, y4), or getting mappings for more than two points, and doing a best fit on the translation and rotation you want.
Hope this helps!