The general form of an operator with respect to a basis is the analog of a matrix with respect to a specific coordinate system. Consider a rotation: you must specify the basis before you can write down the rotation matrix.

When going to an infinite-dimensional vector space, the sums over a discrete basis label become integrals over a continuous label.

Now we specifically ask what the representation of the momentum operator is with respect to the position basis. The result:

Aside: whether the argument is written x-x’ or x’-x is irrelevant.
Using this in the above equation

Aside: Note that, using the theory of functions, you can operate to the left or to the right with this representation by integrating by parts. Operating to the left introduces a negative sign.
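As a sketch of what this aside refers to, assume the standard convention in which momentum acts as −iħ d/dx, and assume wavefunctions that vanish at infinity. The symbols ψ, φ and ħ are introduced here for illustration and may differ from the notation of the original equations.

```latex
\[
\langle x | \hat p | x' \rangle = -i\hbar \, \frac{\partial}{\partial x}\, \delta(x - x'),
\qquad
\langle x | \hat p | \psi \rangle
  = \int dx' \, \langle x | \hat p | x' \rangle \, \psi(x')
  = -i\hbar \, \frac{d\psi}{dx}
\]
\[
\int dx \, \phi^*(x) \, \frac{d\psi}{dx}
  = \Big[ \phi^*(x)\, \psi(x) \Big]_{-\infty}^{\infty} - \int dx \, \frac{d\phi^*}{dx} \, \psi(x)
  = - \int dx \, \frac{d\phi^*}{dx} \, \psi(x)
\]
```

The boundary term drops out under the stated assumption, which is where the negative sign for left operation comes from.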
The representation of the momentum operator as a derivative can be confusing. Let us examine the relationship in more detail to see if this is indeed reasonable.
The first step is to imagine the form of the translation operator. Here we simply consider 4 components of an n-dimensional space. We examine how the components xi are related to the translated components yi.

Suppose one steps forward by one step. (Choose the step to be x2-x1 so that the old component along basis vector e1 is now the component along basis vector e2.)
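The one-step shift just described can be sketched numerically. This is only an illustration: the names `S`, `x`, `y` and `mat_vec` are hypothetical, the sample values are arbitrary, and the component that enters from outside the 4-component window is assumed to be zero.

```python
# One-step shift on a 4-component window (illustrative names and values).
# Assumption: the component shifted in from outside the window is taken as 0.

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# S[i][j] = 1 when i == j + 1: old component j becomes new component j + 1.
n = 4
S = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

x = [10, 20, 30, 40]   # old components x1..x4
y = mat_vec(S, x)      # new components y1..y4

print(y)
```

Each new component equals the old component one slot earlier, so y2 carries the value that x1 had.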

This is a shift of the function over one step. Similarly, if you go to the continuum and consider the step ε, the only contribution is from the state exactly a distance ε away.

Now for small ε (don’t worry about i or the sign), we are looking to express the translation in terms of some property of the current state times the amount, or step, of the translation. This separates the current state and its properties from the amount of the translation. Operators that have this characteristic are referred to as generators. They determine the amount of change per translational step.
In general, one can write an operator in a format similar to the matrix formulation used for finite dimensional vector spaces.

Unit vectors are not always written out explicitly in the matrix representation of an operator. Notice that the matrix formalism requires determining all the matrix elements and using them to find the ith new component by adding up the contributions from each part of the original vector. The particular contribution of the jth component to the ith is the matrix element in row i, column j, times the jth component. Detailing which parts of the old and new vectors are included is the same as projecting out the contributions with an inner product. Thus each matrix element could also be written as an inner product between one basis vector and the operator acting on another.
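In symbols, this might be written as follows. The operator name Ω, the basis vectors e_i, and the component names x_j and y_i are chosen here for illustration, and an orthonormal basis is assumed.

```latex
\[
y_i = \sum_j \Omega_{ij} \, x_j,
\qquad
\Omega_{ij} = \langle e_i | \hat\Omega | e_j \rangle,
\qquad
\hat\Omega = \sum_{i,j} | e_i \rangle \, \Omega_{ij} \, \langle e_j |
\]
```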
Now we are looking for a way to separate the amount from some generic features of the operator. From the intuitive standpoint, a rotation involves a certain type of operation performed over a given angle. The rotation can be separated from the angle: one knows what a rotation is without specifying how much rotation is involved.

Can one mathematically separate operations in the same fashion? For operations that are continuously connected to the identity, one can try to use a generator. The generator carries the essence of the translation and the parameter ε provides the amount. For very small steps the translation becomes the identity plus a term proportional to the step. For large steps the translation takes an exponential form. (Factors of i and ħ have been dropped.)
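A hedged reconstruction of the two forms, with factors of i and ħ dropped as stated: the symbols T for the translation, G for the generator, and a for a finite step are assumptions made here, as is the overall sign, which is chosen so that a positive step moves the function to the right when G is the derivative d/dx.

```latex
\[
T(\epsilon) \approx 1 - \epsilon \, G \quad (\epsilon \to 0),
\qquad
T(a) = \lim_{N \to \infty} \left( 1 - \frac{a}{N} \, G \right)^{N} = e^{-a G}
\]
```

The large-step form is simply N small steps of size a/N applied in succession.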
If you want to write a translation in terms of some generic result times the step, then the operator cannot depend on the step size. Clearly, to first order you could imagine that the slope at that point times the change in x would give you the total change at that point. The first term, 1, just keeps the amplitude for each state the same. The second term finds the change per step multiplied by the step size. Examine the effect on the value y2 based on the original vector x. We need to find the change in the amplitudes, and we base it on the change to the right averaged with the change to the left. There might be


In principle one can imagine a matrix that links the adjacent states in such a way as to calculate the change, and then use the change as the first approximation in computing the new vector by adding the change to the current vector. It is an amazing feature of function theory that the value of a function can be computed at an arbitrary distance away if the function and all its derivatives at some point are known. This feature then allows us to take rotations, time evolution and spatial translations and express them in terms of generators and the amount.
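A minimal numerical sketch of this idea follows. It is not the author's construction: the periodic grid, the test function sin(x), the step 0.5, and the names `derivative` and `translate` are all assumptions made for illustration. The generator is a central-difference matrix (change to the right averaged with change to the left), and the finite translation is built from the power series of the exponential.

```python
import math

# Assumptions (not from the original text): a periodic grid of N points on
# [0, 2*pi), the test function sin(x), and a central-difference generator.
N = 64
h = 2 * math.pi / N
xs = [i * h for i in range(N)]

def derivative(f):
    """Central difference: average of the change to the right and to the left, per step h."""
    return [(f[(i + 1) % N] - f[(i - 1) % N]) / (2 * h) for i in range(N)]

def translate(f, eps, terms=40):
    """Apply exp(-eps * D) to f via its power series, shifting f to the right by eps."""
    result = list(f)
    term = list(f)
    for k in range(1, terms):
        # Build term_k = (-eps * D / k) applied to term_{k-1}.
        term = [-eps * d / k for d in derivative(term)]
        result = [r + t for r, t in zip(result, term)]
    return result

f = [math.sin(x) for x in xs]
eps = 0.5

first_order = [fi - eps * di for fi, di in zip(f, derivative(f))]  # (1 - eps*D) f
full = translate(f, eps)                                           # exp(-eps*D) f
exact = [math.sin(x - eps) for x in xs]                            # f moved right by eps

err_first = max(abs(a - b) for a, b in zip(first_order, exact))
err_full = max(abs(a - b) for a, b in zip(full, exact))
print(err_first, err_full)
```

For a step this large the first-order form is only a rough approximation, while the summed series reproduces the shifted function up to the small error of the finite-difference derivative.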
This discussion was meant to review some aspects of the definition of the momentum operator in terms of the derivative. This note is meant to help clarify how the form of this operator emerges as a derivative.
One nice feature of this development is that one can look at the matrix derived above

This is another way to see that the adjoint of the derivative operator changes sign. The above matrix is a REAL ANTISYMMETRIC matrix, so the transpose is minus the original. Note that if I multiply this matrix by i, then the minus sign generated by the antisymmetry is canceled by the minus sign generated by the complex conjugate. Thus, for i times the matrix, the adjoint is the same as the original.
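This can be checked on a small example. The 4x4 central-difference matrix below, with unit step, is a hypothetical stand-in for the matrix referred to above, and the helper names are illustrative.

```python
# Hypothetical 4x4 central-difference matrix with unit step (illustrative only):
# +1/2 just above the diagonal, -1/2 just below, zeros elsewhere.
n = 4
D = [[0.5 if j == i + 1 else -0.5 if j == i - 1 else 0.0 for j in range(n)]
     for i in range(n)]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def adjoint(m):
    """Complex conjugate of the transpose."""
    return [[z.conjugate() for z in row] for row in transpose(m)]

# D is real and antisymmetric: its transpose is minus itself.
minus_D = [[-d for d in row] for row in D]
is_antisymmetric = transpose(D) == minus_D

# Multiplying by i gives a matrix equal to its own adjoint (Hermitian).
iD = [[1j * d for d in row] for row in D]
is_hermitian = adjoint(iD) == iD

print(is_antisymmetric, is_hermitian)
```

Both checks hold: the sign flip from transposing is undone by the sign flip from conjugating i.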
Another way to examine the sign is to look at translations and ask how the two functions that enter the inner product when expressed in position space must translate to keep the inner product unchanged.
Let us first review the situation for a 2-d space and the rotation operator.
Consider the standard dot product of two vectors, A = (a, b) and C = (x, y): A · C = ax + by.
Now imagine that we want to rotate a vector C and then take the dot product. I can write this in a similar notation: rotate C through an angle θ and then take the dot product with A.
I can preserve the dot product but allow the rotation to occur on either C or A. So I define what it means to rotate A by demanding that the dot product doesn’t change. In this case, if I rotate A by –θ (minus theta), the dot product remains the same.

Notice that if I use matrix multiplication

You can multiply to the left or right. The result of the rotation matrix on x,y is a rotation through θ, while a,b times the matrix results in a rotation in the opposite direction, –θ.


We see that the notation using row and column vectors and 2x2 matrices allows for matrix multiplication either to the left or to the right. You can see that the effect of a matrix on a column is different than on a row and that the difference, when the matrix represents a rotation, corresponds to a change in sign for θ.
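The left and right actions can be compared directly in a short sketch. The angle and the component values are arbitrary choices, the counterclockwise convention for the rotation matrix is an assumption, and the function names are illustrative.

```python
import math

def rotation(theta):
    """2x2 matrix that rotates a column vector counterclockwise by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_col(m, v):
    """Matrix times column vector (operate to the right)."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def row_mat(v, m):
    """Row vector times matrix (operate to the left)."""
    return [v[0] * m[0][0] + v[1] * m[1][0], v[0] * m[0][1] + v[1] * m[1][1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

theta = 0.3
A = [1.0, 2.0]    # row vector (a, b); values are arbitrary
C = [3.0, -1.0]   # column vector (x, y); values are arbitrary
R = rotation(theta)

right = dot(A, mat_col(R, C))            # rotate C by +theta, then dot with A
left = dot(row_mat(A, R), C)             # let the same matrix act on the row A
A_minus = mat_col(rotation(-theta), A)   # A rotated by -theta as a column

print(right, left)
print(row_mat(A, R), A_minus)
```

The two dot products agree, and the row (a, b) times the matrix has the same components as A rotated through –θ.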
Let us remember that the translation of a function

What happens if we translate a function A to the right by an amount ε?

We could indeed also move the function B and get the same overlap, so long as we moved it in the opposite direction. So in order to keep the inner product the same as we apply a translation either to the BRA or KET space, we must step in the opposite direction.
move B to the left.
move A to the right.
This shows that we need to apply a different rule to the BRA space for a derivative so that translations are in opposite directions in each space.
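A discrete check of this statement, as a sketch under simple assumptions: real functions sampled on a periodic grid, a sum standing in for the inner-product integral, and a shift of a whole number of grid steps. The sample values and function names are illustrative.

```python
# Assumptions: periodic grid, real integer samples, and a sum in place of the integral.

def shift_right(f, s):
    """Translate samples s steps to the right: new[i] = old[i - s] (periodic)."""
    n = len(f)
    return [f[(i - s) % n] for i in range(n)]

def shift_left(f, s):
    """Translate samples s steps to the left: new[i] = old[i + s] (periodic)."""
    n = len(f)
    return [f[(i + s) % n] for i in range(n)]

def inner(b, a):
    """Discrete stand-in for the integral of B(x) * A(x) for real functions."""
    return sum(bi * ai for bi, ai in zip(b, a))

A = [0, 1, 4, 2, 0, 0, 0, 0]
B = [0, 0, 0, 3, 5, 1, 0, 0]
s = 2

move_A_right = inner(B, shift_right(A, s))   # translate the ket function
move_B_left = inner(shift_left(B, s), A)     # translate the bra function the other way
move_B_right = inner(shift_right(B, s), A)   # same direction on the bra: different result

print(move_A_right, move_B_left, move_B_right)
```

Moving A to the right and moving B to the left give the same overlap, while moving B to the right does not.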

We need to remember to add a minus sign to the usual definition of the derivative operator when we apply it to the left. For the matrix element of the derivative operator, the default meaning is that it operates to the right with a positive sign as stated above; to move it to the left one needs to add a minus sign.
There are two intertwined questions:
The adjoint is the form of the operator that would produce the same vector in the bra space as was produced by the original in the ket space. The translation operator for the ket space moves a state ε to the right, so the operation on both A and B in ket space results in right-translated states. However, the right translation operator must translate a bra to the left. Thus

In other words, if the operator produces a different impact on the bra space than on the ket space, what is the operator that has the same effect?