CH 431/Lecture 2
From WolfWikis
LECTURE 2
This lecture is the only part of the course that is not taken straight from the book. It is a necessary insert to be able to move on to chapter 17 of the book that deals with the beginnings of statistical thermodynamics. This topic is unthinkable without some basic concepts from probability theory and quantum mechanics.
We shall briefly review a few of the very basic concepts of mathematical statistics and then do a sneak preview of quantum mechanics, a topic found in the first 15 chapters of the book that will be revisited extensively in CH 433.
Probability distributions
A random variable X can take more than one value x as an outcome. Which value it takes in a particular case is a matter of chance and cannot be predicted; all we can do is associate a probability with each outcome.
Probability p is a number between 0 and 1 that indicates the likelihood that the variable X has a particular outcome x.
The set of outcomes and their probabilities form a probability distribution.
There are two kinds of distributions:
- discrete ones
- continuous ones
Total probability should always add up to unity.
Discrete distributions
| S&McQ cf. 63 |
A good example of a discrete distribution is the toss of a fair coin. The random variable X can have two values:
- heads (0)
- tails (1)
Both have equal probability and as the sum must equal unity, the probability must be ½ for each.
'The probability that X=heads' is written formally as:
- Pr(X=heads) = Pr(X=0) = 0.5
The random function is written as a combination of three statements. (Mathematicians usually combine them by putting {..} around the statements but this is beyond my wiki-skills, so I use yellow box instead)
|
Indistinguishable outcomes
When flipping two coins we can get four outcomes: two heads (0), heads plus tails (1), tails plus heads (1), two tails (2).
Each outcome is equally likely, which implies a probability of ¼ for each:
- Pr(heads,heads) = Pr(heads,tails) = Pr(tails,heads) = Pr(tails,tails) = ¼
The probability of a particular outcome is often abbreviated simply to p. The two middle outcomes collapse into one with p=¼+¼= ½ if the coins are indistinguishable. We will see that this concept has very important consequences in statistical thermodynamics.
If we cannot distinguish the two outcomes leading to Xtot=1 we get the following random function:
- Pr(Xtot=0) = ¼
- Pr(Xtot=1) = ¼+¼ = ½
- Pr(Xtot=2) = ¼
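The collapse of the two middle outcomes can be checked by brute-force enumeration. The following is a minimal sketch in Python; the function name `total_distribution` and the coding 0 = heads, 1 = tails (taken from the single-coin example) are choices made here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def total_distribution(n_coins):
    """Distribution of X_tot (the number of tails) for n fair coins."""
    outcomes = list(product((0, 1), repeat=n_coins))  # 0 = heads, 1 = tails
    p_each = Fraction(1, len(outcomes))               # every ordered outcome is equally likely
    dist = Counter()
    for outcome in outcomes:
        dist[sum(outcome)] += p_each                  # indistinguishable outcomes add up
    return dict(dist)

two_coins = total_distribution(2)
for x_tot, p in sorted(two_coins.items()):
    print(x_tot, p)
```

Using exact fractions rather than floating point numbers makes it easy to see that the probabilities still add up to unity after the outcomes are merged.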
Notice that it is quite possible to have a distribution where the probabilities differ from outcome to outcome. Often the p values are given as f(x), a function of x. An example:
|
X3 defined as:
|
The factor 1/6 makes sure the probabilities add up to unity. Such a factor is known as a normalization factor. Again this concept is of prime importance in statistical thermodynamics.
Another example of a discrete distribution is a die. If it has 6 sides (the most common die) there are six outcomes, each with p= 1/6. There are also dice with n=4, 12 or 20 sides. Each outcome will then have p= 1/n.
Moments of distributions
| S&McQ 64-65 |
Important properties of a probability distribution are its moments. They are values computed by summing over the whole distribution.
The zeroth moment is simply the sum of all p, which is unity:
- $\langle X^0 \rangle = \sum X^0 p = \sum 1 \cdot p = 1$
The first moment multiplies each outcome with its probability and sums over all outcomes:
- $\langle X \rangle = \sum X p$
This moment is known as the average or mean. (It is what we have done to your grades for years...)
For one coin <X> is ½, for two coins <X> is 1. (Exercise: verify this)
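The second moment is computed by summing the product of the square of X and p:
- $\langle X^2 \rangle = \sum X^2 p$
- For one coin we have $\langle X^2 \rangle = 0^2 \cdot \tfrac12 + 1^2 \cdot \tfrac12 = \tfrac12$
- For two coins $\langle X^2 \rangle = [0 \cdot \tfrac14 + 1 \cdot \tfrac12 + 4 \cdot \tfrac14] = 1.5$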
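The coin values quoted above can be verified with a few lines of Python. This is only a sketch: the helper `moment` and the dictionary layout `{outcome: probability}` are assumptions made for this example.

```python
from fractions import Fraction

def moment(dist, order):
    """Moment <X^order> of a discrete distribution given as {outcome: probability}."""
    return sum(x**order * p for x, p in dist.items())

half, quarter = Fraction(1, 2), Fraction(1, 4)
one_coin = {0: half, 1: half}
two_coins = {0: quarter, 1: half, 2: quarter}

for name, dist in (("one coin", one_coin), ("two coins", two_coins)):
    # zeroth, first and second moment
    print(name, [moment(dist, k) for k in (0, 1, 2)])
```

The same helper works for any discrete distribution, for example a die with six outcomes of probability 1/6 each.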
|
The notation <...> is used a lot in quantum mechanics, often in the form <ψ|ψ> or <ψ|H|ψ>. The <ψ| part is known as the bra, the |ψ> part as the ket. (Together: bra(c)ket.)
|
Intermezzo: The strange employer
You have a summer job, but your employer likes games of chance. At the end of every day he rolls a die and pays you the square of the outcome in dollars per hour. So on a lucky day you'd make 36 dollars per hour, but on a bad day only 1 dollar. Is this a bad deal? What would you make on average over a longer period? To answer this we must compute the second moment of the distribution:
- $\langle X^2 \rangle = \tfrac16\,[1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2] = \tfrac{91}{6} \approx 15.17$
(I have taken p=1/6 out of brackets because the value is the same for all six outcomes.) So on average you would make about 15.17 dollars per hour.
As you see in the intermezzo, the value of the second moment is in this case what you can expect to make in the long run. Moments are examples of what are known as expectation values.
Another term you may run into is that of a functional. A functional is a number computed by some operation (such as summation or integration) over a whole function. Moments are clearly an example of that too.
Continuous distributions
| S&McQ 66-67 |
Now consider a spherical die. One could say it has an infinite number of facets that it can land on. Thus the number of outcomes n = ∞, which makes each probability p = 1/∞ = 0.
This creates a bit of a mathematical problem, because how can we get a total probability of unity by adding up zeros?
Also, if we divide the sphere in a northern and a southern hemisphere clearly the probability that it lands on a point in say, the north should be ½. Still, p = 0 for all points...
The solution requires calculus.
We introduce a new concept: probability density over which we integrate rather than sum. We assign an equal density to each point of the sphere and make sure that if we integrate over a hemisphere we get ½. (This involves two angles θ and φ and an integration over them and I won't go into that).
A somewhat simpler example of a continuous distribution is the 1D uniform distribution. It is the one that the Excel function =RAND() produces to good approximation. Its probability density is defined as
- f(x) = 1 for 0<x<1
- f(x) = 0 elsewhere
The figure shows a (bivariate) uniform distribution.
The probability that the outcome is smaller than 0.5 is written as Pr(X<0.5) and is found by integrating f(x) from 0 to 0.5.
- $\Pr(X<0.5) = \int_0^{0.5} f(x)\,dx = \int_0^{0.5} 1\,dx = [x]_0^{0.5} = 0.5$
Notice that for each individual outcome b the probability is indeed zero, because an integral from b to b is always zero, even if the probability density f(b) is not zero.
Clearly the probability and the probability density are not the same. Unfortunately the distinction between probability and probability density is often not properly made in the physical sciences.
Moments can also be computed for continuous distributions by integrating over the probability density
Another well-known continuous distribution is the normal (or Gaussian) distribution, defined as:
- $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\tfrac12 \left[\frac{x-\mu}{\sigma}\right]^2\right)$
(Notice the normalization factor $1/[\sqrt{2\pi}\,\sigma]$.)
To compute the moments of a continuous distribution we have to evaluate an integral instead of a summation:
- $\langle X \rangle = \int f(x)\,x\,dx$
- $\langle X^2 \rangle = \int f(x)\,x^2\,dx$
For the normal distribution $\langle X \rangle = \mu$.
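The integrals for the uniform and the normal distribution can also be evaluated numerically. The sketch below uses a simple midpoint rule; the helper `integrate`, the values μ = 2 and σ = 0.5, and the integration limits of ±10σ are arbitrary choices for illustration. The comparison value μ² + σ² for the second moment is a standard result that is not derived in this lecture.

```python
import math

def uniform_pdf(x):
    """Probability density of the 1D uniform distribution on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def normal_pdf(x, mu, sigma):
    """Probability density of the normal distribution."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

mu, sigma = 2.0, 0.5                       # arbitrary illustrative values
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # tails beyond this are negligible

prob_below_half = integrate(uniform_pdf, 0.0, 0.5)
norm = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
second = integrate(lambda x: x**2 * normal_pdf(x, mu, sigma), lo, hi)
print(prob_below_half, norm, mean, second)
```

The total probability comes out as (very nearly) one, which is exactly what the normalization factor is for.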
|
Energy levels
Quantization
Your car can drive at an infinite number of speeds v, say between 0 and 70 mph. Its kinetic energy $E_{kin} = \tfrac12 mv^2$ can therefore also assume an infinite number of values. You'll probably drive 35 mph more often than 70. So we could describe your driving habits with a continuous probability density function.
On the atomic level this is often not the case. Particularly if you confine a small particle, say a He atom, in a small space, its distribution of possible speeds is discrete. Only certain energy levels occur; others simply do not, just like a die never returns 3.14159, only integer values like 3 or 4. The kinetic energy is quantized (compare the difference between an analog and a digital signal or image).
Duality
| S&McQ 15-18 |
The reason for quantization is that small particles also behave as waves.
Did I lose you here? Sounds pretty crazy I know. After all these years I still have problems wrapping my mind around this idea. I have a hard time imagining my car as a wave. Luckily for all practical purposes it is not. It is too big and heavy to notice its wave character. The same thing is true when you are bowling. I have never seen a diffraction pattern there, have you? However, if I replace the bowling ball by a beam of neutrons or electrons and the pins by the molecules of a crystal I get a beautiful diffraction pattern. Just as if the beam is a beam of light shone on a grating or a CD.
Waves have wavelengths and phases and they undergo interference. If neutrons and electrons do the same they must also have a wavelength. This wavelength is known as the de Broglie [də brɶj] wavelength λ.
- λ = h/mv
The value of h (Planck's constant) is very small, only $6.626 \times 10^{-34}$ J s!
This formula immediately explains the cars and bowling balls: their mass m is so big, and Planck's constant h so small, that the wavelength becomes negligibly small and wave effects are not observed.
(Exercise: calculate your own wavelength when traveling at 55 mph.) Notice that the wavelength changes when the velocity or $E_{kin}$ changes.
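To get a feeling for the orders of magnitude, the de Broglie formula can be evaluated for a macroscopic and a microscopic object. The numbers in this sketch are assumptions chosen for illustration: a person of 70 kg moving at 55 mph and an electron moving at 10^6 m/s.

```python
H_PLANCK = 6.626e-34       # Planck's constant in J s
MPH_TO_M_PER_S = 0.44704   # 1 mph expressed in m/s
M_ELECTRON = 9.109e-31     # electron mass in kg

def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """de Broglie wavelength lambda = h / (m v) in metres."""
    return H_PLANCK / (mass_kg * speed_m_per_s)

person = de_broglie_wavelength(70.0, 55 * MPH_TO_M_PER_S)  # assumed mass of 70 kg
electron = de_broglie_wavelength(M_ELECTRON, 1.0e6)        # assumed speed of 1e6 m/s

print(f"person:   {person:.2e} m")
print(f"electron: {electron:.2e} m")
```

The wavelength of the person is dozens of orders of magnitude smaller than an atom, while that of the electron is comparable to the spacing between atoms in a crystal, which is why only the latter shows diffraction.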
Standing waves in a 'box'
Waves have another property: when confined they form standing waves. These have discrete characteristics. A good example is the string of e.g. a harp. It produces a particular tone. When pinched off in the middle we hear the octave (twice the frequency). Harpists call this a flageolet. On woodwinds the same thing is achieved by 'overblowing'. On many types of instruments it is possible to get the 3rd 4th and higher harmonics (octave+ fifth, double octave etc.). Tones in between are not produced, unless the length of the resonant body (string, tube) is changed.
The reason that only particular waves can resonate on a string of fixed length is that the standing wave pattern must fit between the points where the string is fixed, because at the edges the wave motion needs to be zero (resulting in a node at both ends).
This requires that half the wavelength fits an integer number of times between these limits. The same thing happens if we put a small particle in a box: its half-wavelength must fit in the box an integer number of times as well. However, this can only happen if its kinetic energy has particular values.
As you can see the wavelengths of the 'captured' waves are 2a, 2a/2, 2a/3, 2a/4 etc. if the 'box' they are trapped in has a length a. In general $\lambda_n = 2a/n$.
The integer n in the denominator is known as a quantum number. It runs over the values n = 1, 2, 3, ... to infinity.
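The kinetic energy is restricted to particular values because for a particle:
- $E_{kin} = \tfrac12 mv^2$
- and:
- $\lambda = h/mv$
A bit of algebra (substitute $v = h/m\lambda$ with $\lambda = 2a/n$) shows that
- $E_{kin} = \tfrac12 m \left[\frac{hn}{2ma}\right]^2 = \tfrac12 \frac{h^2 n^2}{4ma^2} = \left[\frac{h^2}{8ma^2}\right] n^2$
As you can see, the kinetic energy increases quadratically with the value of the quantum number n.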
Size changes
As we said above only certain wavelengths are allowed for a given size of the box. We can change the allowed wavelengths by changing the size of the box. Musicians do that all the time. On a guitar you clamp the string on a fret, on a recorder you open an extra hole, even on a harp you can step on a pedal that activates a mechanism to shorten or lengthen the string to sharpen or flatten the tone.
We can see from the factor $h^2/8ma^2$ what happens to the energies of our particle waves if we increase the side a of our box. Let's look at the energy difference ΔE between a state with n=2 and n=1: $\Delta E = E_2 - E_1 = [h^2/8ma^2]\,2^2 - [h^2/8ma^2]\,1^2 = 3h^2/8ma^2$
Obviously if a increases ΔE becomes smaller. Consequently if the box size increases the energy levels become more closely spaced.
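The effect of the box size on the level spacing can be made concrete with numbers. The sketch below evaluates the box formula for a He atom; the mass value and the three box sizes (1 nm, 1 μm and 10 cm) are assumptions picked for illustration.

```python
H_PLANCK = 6.626e-34         # Planck's constant in J s
M_HE = 4.0026 * 1.6605e-27   # approximate mass of a He-4 atom in kg

def box_energy(n, mass, a):
    """Kinetic energy [h^2 / (8 m a^2)] * n^2 of level n in a 1D box of length a."""
    return H_PLANCK**2 * n**2 / (8 * mass * a**2)

def gap_2_1(mass, a):
    """Energy difference between the n=2 and n=1 levels."""
    return box_energy(2, mass, a) - box_energy(1, mass, a)

for a in (1e-9, 1e-6, 0.1):  # box lengths in metres
    print(f"a = {a:.0e} m   E1 = {box_energy(1, M_HE, a):.3e} J   "
          f"dE = {gap_2_1(M_HE, a):.3e} J")
```

Every factor of ten in the box length reduces the spacing by a factor of one hundred, so for a macroscopic vessel the translational levels lie extremely close together.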
Potential energy
As you see we only considered kinetic energy for our particle wave in a box. Actually that is not quite true. We did consider a potential energy $V_{pot}$ in a somewhat hidden way. We assumed that the box had walls where the particle could not go. This is the same thing as assuming the following potential energy function:
- $V_{pot}(x) = 0$ for 0<x<a
- $V_{pot}(x) = \infty$ elsewhere
There is a way to calculate the energy values for more complicated potential energy functions that will be treated in CH433. It involves solving a differential equation known as the Schrödinger equation. For the moment the above derivation of the box result suffices.
Ideal gases, 3D boxes
The above 'derivation' only took one dimension into account, but our gases are typically three dimensional.
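For a 3D box with sides a, b and c we can find a similar expression:
- $E_{n_1 n_2 n_3} = \frac{h^2}{8m}\left[\frac{n_1^2}{a^2} + \frac{n_2^2}{b^2} + \frac{n_3^2}{c^2}\right]$
If the vessel is a cube with side a this becomes:
- $E_{n_1 n_2 n_3} = \frac{h^2}{8ma^2}\left[n_1^2 + n_2^2 + n_3^2\right]$
This formula contains the integers $n_1$, $n_2$, $n_3$, i.e. we have three quantum numbers. They tell you how many times the half-wavelength of the particle fits in the box in each of the three directions. They also determine which energy state the particle (read: wave) is in.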
Degeneracy
| S&McQ 94 |
Notice that ($n_1$,$n_2$,$n_3$) = (1,2,1), (1,1,2) and (2,1,1) produce the same energy provided the vessel is a cube. These three states are said to be degenerate. Obviously this degeneracy is a consequence of the symmetric shape of the vessel, because in a rectangular box with unequal sides these states do have different energies. If we were to change (distort) the shape of the box to a parallelepiped with unequal sides, the three energies $E_{121}$, $E_{112}$, $E_{211}$ would become different, i.e. the degeneracy is lifted.
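Degenerate states can be found by listing all combinations of quantum numbers and grouping them by energy. The sketch below assumes the standard 3D box expression E = (h²/8m)[n1²/a² + n2²/b² + n3²/c²] and works in units of h²/8m; the function name `levels`, the cut-off at n = 3 and the side lengths 1, 1.1 and 1.2 of the distorted box are illustrative choices.

```python
from collections import defaultdict
from itertools import product

def levels(a, b, c, n_max=3):
    """Group the states (n1, n2, n3) by energy, in units of h^2 / 8m."""
    grouped = defaultdict(list)
    for n1, n2, n3 in product(range(1, n_max + 1), repeat=3):
        energy = n1**2 / a**2 + n2**2 / b**2 + n3**2 / c**2
        grouped[round(energy, 9)].append((n1, n2, n3))  # rounding absorbs float noise
    return dict(sorted(grouped.items()))

cube = levels(1.0, 1.0, 1.0)
distorted = levels(1.0, 1.1, 1.2)

for energy, states in cube.items():
    print(energy, len(states), states)
```

In the cube several levels contain three or six states, whereas in the distorted box every state listed here ends up with its own energy.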
Translational motion, degrees of freedom
| S&McQ 519 |
The particle in a box model is a good model for the translational energy states of a gas molecule in an enclosed volume as long as the gas can be considered ideal. Translation is simply movement in a straight line. For a monatomic gas that is all the gas molecule (read: atom) can do: travel in the X, Y or Z direction. Each atom is therefore said to possess three translational degrees of freedom.
For polyatomic gas molecules the constituent atoms cannot exercise their translational degrees of freedom independently. They only have three translational degrees of freedom per molecule (not: per atom). The missing degrees of freedom are replaced by vibrational and rotational degrees of freedom.
Rotational and vibrational energy states
Polyatomic molecules have two more kinds of energy states:
- rotational
- vibrational
At very high temperatures or under irradiation (UV e.g.) it is also possible to excite the electrons of the atoms. This adds a fourth term:
- electronic degrees of freedom
The energy states of all of these can be described by quantum mechanics involving quantum numbers. Once we define their potential energies we can use the Schrödinger equation and solve for their energy states.
The important thing for now is that all these energies go in steps, just like for the box problem. However, the step sizes ΔE that separate the levels are rather different. The difference in energy is determined by the factor $h^2/8ma^2$ for translational motion. In general that is a pretty small number, particularly if the box is big. For the other degrees of freedom the determining factor depends on such things as the moment of inertia of the molecule or the strength of the bond between the vibrating atoms. These factors tend to be much bigger. Typically we have:
- $\Delta E_{trans} \ll \Delta E_{rot} \ll \Delta E_{vib} \ll \Delta E_{elec}$
We will give the two simplest models for rotational and vibrational energy without further explanation (See CH434):
- Rigid rotor:
- $E_{rot} = B\,J(J+1)$; quantum number J = 0, 1, 2, …
- Harmonic oscillator:
- $E_{vib} = h\nu\,(v + \tfrac12)$; quantum number v = 0, 1, 2, …
Note: ν is the Greek letter nu. It is used for the frequency.
Unfortunately, it looks much like the Latin letter v (vee) used as a quantum number in the same formula. In fact this wiki makes them look identical...
- Be sure to know the difference:
- Greek nu: frequency with units of inverse seconds; its size reflects bond strength and the mass of the molecule
- Latin vee: quantum number, a dimensionless integer that indicates which energy state the system is in.
Statistical thermodynamics
Thermal energy
A confined monatomic gas can be seen as a box with a whole bunch of atoms in it. Each of these particles can be in one of the states given by the last formula. If all of them have large n values there is obviously a lot of kinetic energy in the system. The lowest energy is when all atoms have 1,1,1 as quantum numbers.
Boltzmann realized that this should relate to temperature. When we add energy to the system (by heating it up without changing the volume of the box) the temperature goes up. At higher temperatures we would expect higher quantum numbers, at lower T lower ones.
But how exactly are the atoms distributed over the various states?
This is a good example of a problem involving a discrete probability distribution. The probability p that a certain level n = ($n_1$,$n_2$,$n_3$) with energy $E_n$ is occupied should be a function of temperature: p(T), but which one?
| S&McQ 693ff |
This is the point where we rejoin the book at Chapter 17.
Boltzmann postulated that you could look at temperature as a form of energy, the thermal energy, and that this thermal energy is directly proportional to temperature (in kelvin of course).
- $E_{thermal} = k_B T$
The proportionality constant $k_B$ (often written simply as k) is named after him: the Boltzmann constant. It plays a central role in all statistical thermodynamics.
The Boltzmann probability factor
He further postulated that the probability that a certain energy level $E_j$ is occupied depends on the ratio $E_j/E_{thermal}$.
If $E_j \gg E_{thermal}$ the level is unlikely to be occupied.
Boltzmann further determined how the probability $p_j$ depends on this ratio. He postulated that:
- $p_j$ is proportional to $\exp(-E_j/E_{thermal})$
The exponential expression on the right is known as the Boltzmann factor.
Of course he had his reasons to pick an exponential function and we will consider that later.
If we add up this exponential over all states j, we get:
- $Q = \sum_j \exp(-E_j/E_{thermal})$
In general this sum Q is not equal to one. As argued in the statistics section above, that means that we need to divide by Q as a normalization factor to find proper probabilities:
- $p_j = \exp(-E_j/E_{thermal})/Q$
This factor Q plays a key role in statistical thermodynamics. It is known as the partition function. (Much more about it below.)
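The normalization by Q is easy to try out numerically. The sketch below uses four hypothetical, equally spaced energy levels and three values of the thermal energy, all in the same arbitrary units; the function name `boltzmann_probabilities` is likewise just a choice made here.

```python
import math

def boltzmann_probabilities(energies, e_thermal):
    """Return the probabilities p_j = exp(-E_j / E_thermal) / Q together with Q."""
    factors = [math.exp(-e / e_thermal) for e in energies]  # Boltzmann factors
    q = sum(factors)                                        # partition function
    return [f / q for f in factors], q

levels = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels, same units as e_thermal
for e_thermal in (0.5, 1.0, 5.0):
    probs, q = boltzmann_probabilities(levels, e_thermal)
    print(e_thermal, round(q, 4), [round(p, 4) for p in probs])
```

At low thermal energy nearly all of the probability sits in the lowest level; at high thermal energy the levels become almost equally likely.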
Why the Boltzmann factor is an exponential
The way Boltzmann showed that the probability function is exponential involved a pretty complicated statistical argument. McQuarrie & Simon present it in a pretty 'hand waving' way in section 17-1 and you should read that carefully.
Boltzmann's own argument involves maximum likelihood.
Suppose you have 100 particles. The total energy is say 1000 units and each particle has levels 0,1,2,.. etc.
Of course one possible, but extreme, distribution is that one particle has an energy E=1000 and all the others E=0. However, it is far more likely that the energy is spread out over the particles, with typical energies around the average E=10 (some lower, some higher). Particularly if you have a gigantic number of particles, extreme distributions become so unlikely that it would be a miracle if they occurred.
Boltzmann showed that his distribution (the one with the exponential) is simply the most likely one. Yes, fluctuations around his most likely distribution can occur, but if you have $N_{Avogadro}$ particles (one mole) the deviations are pretty limited.
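For the example of 100 particles sharing 1000 energy units the distribution over the levels can be counted exactly. The sketch below is not Boltzmann's own derivation; it rests on the assumption, made here for illustration, that every way of dividing the energy units over the (distinguishable) particles is equally likely. The number of such ways is counted with binomial coefficients.

```python
from math import comb

def single_particle_distribution(n_particles, total_energy):
    """Pr(one given particle has energy e), assuming all arrangements are equally likely."""
    # ways to divide total_energy units over n_particles particles
    all_ways = comb(total_energy + n_particles - 1, n_particles - 1)
    return [
        # ways to divide the remaining units over the other particles
        comb(total_energy - e + n_particles - 2, n_particles - 2) / all_ways
        for e in range(total_energy + 1)
    ]

p = single_particle_distribution(100, 1000)
print("p(0)..p(4):", [round(x, 4) for x in p[:5]])
print("ratios p(e+1)/p(e):", [round(p[e + 1] / p[e], 4) for e in range(4)])
print("p(1000):", p[1000])
```

The ratio between successive probabilities is nearly constant for the low levels, which is the signature of an exponential, and the extreme case in which one particle holds all 1000 units is vanishingly improbable.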
McQuarrie and Simon give a rationale for why the most likely distribution should involve an exponential function.
- Take a very large number of entities (e.g. gas molecules, but it's more general than that). Call this a canonical ensemble.
- Divide the ensemble in a large number of systems that are in equilibrium with each other.
The energies of two systems, $E_1$ and $E_2$, are not necessarily the same. The relative number of systems in state 1 and 2 ($a_1$ and $a_2$) should be some function of their energies:
- $a_1 / a_2 = f(E_1, E_2)$
Now energy is an additive quantity without an absolute zero point: it is always measured relative to a zero that we are free to choose. That means that the above ratio should really be a function of the energy difference. (Yes, critical people call this hand waving. Would you prefer a couple of pages of boring math?)
- $a_1 / a_2 = f(E_1 - E_2)$
Now we have a ratio on one side and a difference on the other. Mathematics teaches us that you can easily turn one into the other using an exponential (or a logarithm if you go the other way), because the exponential is the function for which f(x+y) = f(x)·f(y). Thus, the function f should be exponential. With the sign chosen such that states of higher energy are less populated, in line with the Boltzmann factor:
- $f(E) = e^{-\beta E}$
We do not have a rationale why β should be the reciprocal of the thermal energy (1/kT) as yet, but this can be shown as well.


