An example of a partial function that is not a total function. (Photo credit: Wikipedia)

unfortunately you won't find this in a schoolbook. but the truth is, maths is very limited in middle school.
one important concept you learn there, after basic set-theory, is the notion of a function.

given 2 sets, you “map” each element of the first set onto whatever element you choose from the 2nd.
no need for all elements of the 2nd set to have something mapped to it.
graphically you just draw arrows from the elements of the first set onto elements of the 2nd.
this way quite naturally a new set is created: the image of your function. that’s a subset of the 2nd set.
if you only map part of the elements of the 1st set onto something, you get a partial function.
a partial function quite naturally creates another set: the definition range. it’s a subset of the 1st set.
an actual (total) function has its domain (that's what the 1st set is called) and its definition range being one and the same set.
also functions are distinguished by the properties “injective” and “surjective“.
injective is a function where no element in the image has two or more “arrows” pointing to it.
surjective is when the image isn’t just subset of the set the function is mapping to. it’s when the full set is covered by the image.
a bijective function has both properties: it's injective and surjective. therefore one can define an inverse function, with arrows pointing in the opposite direction.
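for finite sets these properties are easy to play with on a computer. here's a small python sketch; the dict representation and the helper names are my own invention, just for illustration:

```python
# a finite "function" as a dict: keys are the domain, values say where each element maps
def is_injective(f):
    # injective: no element has two or more "arrows" pointing to it
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    # surjective: the image covers the whole 2nd set
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

def inverse(f):
    # only meaningful for a bijective function: flip every arrow around
    return {v: k for k, v in f.items()}

f = {1: "a", 2: "b", 3: "c"}
print(is_bijective(f, {"a", "b", "c"}))  # True
print(inverse(f))                        # {'a': 1, 'b': 2, 'c': 3}
```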

a much better way to depict a function is to draw its values. visualize it so the viewer can predict what values it will have.
a common way to depict functions mapping points of a line or plane onto another line or plane, is to draw the points into a bigger coordinate system. that picture is called a Graph of that function.
just designate 1-2 coordinates for the domain, and draw a point in the remaining coordinates according to the function’s output.

if it’s a function from real numbers onto a 2d-plane, it’s more common to draw a curve in that plane.
maybe also add some arrows and markings to the curve to show the sequence in which the points get added.
of course, if you'd draw the same function into 3d-coordinates as a graph, you'd get something different.
but just look at the 3d-image from the direction of the coordinate axis you chose for the domain.
projecting the 3d-curve onto a plane orthogonal to that axis, you get a plane with a 2d-curve in it: the drawing I described at the beginning of this paragraph. so it isn't entirely different.

all this works great for smooth functions. the viewer can just imagine all the points in between the ones you draw.
one must take care that such points in between are what the viewer expects, though.
one must choose wisely what part to show and how much to magnify the function.
calculating the extrema of a function serves, among other things, the purpose of acquiring that knowledge.
so later on in the text I'll talk a bit about derivatives.

a function is just a glyph (or whatever decoration) along with 2 sets and some description.
this glyph is representing the function’s name.
as above the 1st set is the Domain, and the 2nd set is the value-range. the set of values the function might output, the image, is a subset of the latter.

you might remove a single element from the value-range, one that isn't in the image, and technically you get a new function.
this way a function is more akin to a procedure in a computer-program.
there too, altering the type of the variables used as input or output basically gives you a new thing,
even when the code defining it hasn't changed.

another similarity to computer-programming is the notion of a variable.

to write down the "description" of a function, mostly formulas are used.
but the description could also be a set of tuples, each representing one of those "arrows" you'd see when depicting the function as 2 sets connected by a mapping.
a third possibility is to write some algorithm, akin to an actual computer-program. and sometimes the description will be just some plain text.

no matter how your function is described, in front of the description you see something like "f(x)=".
there the "f" is the glyph used for the function; it might be a greek letter or a word, even hebrew letters might be used.
and the "x" is the variable, like variables in programming languages. again the variable is written as another glyph that might be greek or hebrew and/or have various decorations.

actually, a function might not be that new a concept at all: whenever we use some device we encounter that principle.
you do something and get something else in return. input and output, cause and effect.
however, what really is new in middle-school is the idea of variables and formulas.
you have some text, written in whatever language, maybe computer-program, maybe plain text, maybe in the language of mathematics or logic.
in that text you have strange letters or whatever glyphs, maybe whole words, that somehow don’t make sense.
but in the context of describing a function those are meant to be seen as variables.
to the reader it means that their meaning will be defined later on. for now there is some info on what they might contain though.
so when you have to evaluate a function and read something like "f(3)", it means: in the function's description that started with "f(x)=", replace x by the value 3 everywhere after the "=", then read that altered description again to learn what value the function outputs.
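that substitution-reading can be acted out literally in python; the string replacement and eval are just a toy illustration of the idea, not how one should really evaluate things:

```python
# the description of f, i.e. the text after "f(x)="
description = "x * x + 1"

# evaluating f(3): replace the variable x by 3, then read the description again
altered = description.replace("x", "3")
print(altered)        # 3 * 3 + 1
print(eval(altered))  # 10

# a programming language does the same substitution for you:
f = lambda x: x * x + 1
print(f(3))  # 10
```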
and it gets even more complicated when you see the glyph used for the function as a variable too. maybe the function isn’t given in a defining way? maybe that function-name is part of a formula?
I say, learn programming! once you can do that, this aspect of maths shouldn’t be a problem.

well, that's not all; there's another concept to learn in middle school, starting already at the beginning of mathematical education.
it starts out as multiplication. it continues with division and polynomials and their roots, and finally ends with trigonometry. all these things are really just about exponentiation.

the fundamental claim about natural numbers (the fundamental theorem of arithmetic) is that you combine prime numbers by multiplication to get everything.
sometimes the same prime number must be repeated several times, so you abbreviate this by exponentiation. for example 27=3^3=3\cdot3\cdot3.
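that decomposition is easy to compute for small numbers; here's a python sketch, with prime_factors being a name I made up:

```python
def prime_factors(n):
    # repeatedly divide out the smallest factor; the exponents count the repeats
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:  # whatever is left is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factors(27))   # {3: 3}, i.e. 27 = 3^3
print(prime_factors(360))  # {2: 3, 3: 2, 5: 1}
```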
this introduces a new operator on the natural numbers. the same operator can be extended to the real and complex numbers.
an operator, binary in this case (since it works on 2 variables), is just a function like above.
so in middle-school there are 3 binary operators: plus, times, and "to the power of". there's also one unary function: \ln x or {_e\!\log x}
please note that subtraction and division are not listed, because they are already covered by those 3 operators!
to subtract you just multiply one number by \text{-}1 and add. to divide you take the divisor to the power of \text{-}1 and multiply.

the important formulas are:
a-b=a+b\cdot(\text{-}1) and {a \over b}=a\cdot b^{\text{-}1}. and keep in mind (\text{-}1)\cdot(\text{-}1)=1 as well as (a^b)^c=a^{b\cdot c} and a^b\cdot a^c=a^{b+c} and a^c\cdot b^c=(a\cdot b)^c.
that's just the beginning. in middle-school you also learn about roots, most prominently the square root \sqrt a. but for each exponent there is a root inverting its exponentiation.
again it is no omission that I didn't list roots together with the other 3 operators. the basic formula here is \sqrt[b] a=a^{1\over b}=a^{(b^{\text{-}1})}.
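all of these reductions can be sanity-checked numerically in python (a quick check on sample values, not a proof):

```python
import math

a, b = 6.0, 4.0
# subtraction is adding a (-1)-multiple
assert a - b == a + b * (-1)
# division is multiplying by the (-1)th power
assert a / b == a * b ** (-1)
# the bth root is just the (1/b)th power
assert math.isclose(a ** (1 / b), a ** (b ** (-1)))
assert math.isclose(math.sqrt(a), a ** (1 / 2))
print("all identities hold")
```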

the brackets I put there because exponentiation differs in one important aspect from times and plus: it makes a big difference how you put the brackets when several exponentiation-operations are chained together.
i.e. a^{(b^c)}\ne(a^b)^c. so be careful and make use of brackets in such cases.
and a^b\ne b^a explains why exponentiation shouldn't be written as an infix operator like "^":
we simply are used to being able to swap around the inputs of binary operators, which works for plus and times but not here.

finally, 2D-trigonometry is handled by complex numbers in combination with exponentiation.
a complex number is just a term of the form a+b\cdot i and the rule that i\cdot i=i^2=\text{-}1.
quite prominent is the formula e^{\pi\cdot i}=\text{-}1 where e is the euler number.
now the euler number isn't an ordinary number: it isn't rational, and it cannot be expressed through polynomials or their roots.
in that respect it is much like the number \pi, which in turn is merely the circumference of half a circle of radius 1.
so while \pi is described by approximating the half-circle's circumference, e is described by (1+n^{\text{-}1})^n=({n+1\over n})^n in the limit as n goes to infinity.
but much more enlightening about exponentiation is the formula e^x=\sum\limits_{k=0}^{\infty}{x^k\over k!}=1+x+{x^2\over 2}+{x^3\over 6}+{x^4\over 24}+\cdots=1+x(1+{x(1+{x(1+{x(1+\cdots)\over 4})\over 3})\over 2}).
it is enlightening because it also works for x being a rational or complex number. actually this formula is where all the stuff about “roots are just exponentiation” or “dividing is same as to the power of -1” comes from.
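the series can be summed in a few lines of python; exp_series is a made-up name, and 30 terms are plenty for small inputs. note that the very same code accepts complex x:

```python
import math

def exp_series(x, terms=30):
    # sum x^k / k!, reusing the previous term instead of recomputing factorials
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total

print(exp_series(1.0))  # close to e = 2.71828...
print(math.isclose(exp_series(2.5), math.exp(2.5)))  # True

# the same code works for complex x, e.g. Euler's identity e^{i*pi} = -1:
z = exp_series(1j * math.pi)
print(abs(z - (-1)) < 1e-9)  # True
```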
this formula makes exponentiation into a function, an unary function, a function in a single variable.

to get totally minimalistic one could define a^b=e^{b\cdot\ln a} for positive numbers. when a is negative, think of it as a^b=(\text{-}1)^b\cdot(\text{-}a)^b=e^{b\cdot\pi\cdot i}\cdot e^{b\cdot\ln(\text{-}a)}=e^{b\cdot(\ln(\text{-}a)+\pi\cdot i)}.

This is a demonstration that Exp(i*Pi)=-1 (called Euler’s formula, or Euler’s identity). It uses the formula (1+z/N)^N –> Exp(z) (as N increases). The Nth power is displayed as a repeated multiplication in the complex plane. As N increases, you can see that the final result (the last point) approaches -1, the actual value of Exp(i*pi).

sounds quite complicated.
but take a look at this formula: R\cdot e^{\varphi\cdot i}.
and now imagine \varphi to be the length of a small arc of a circle with radius 1.
the output of this formula is a point on a circle of radius R, at the same angle that this small arc \varphi measures.

the output is a complex number,
a point in the complex plane: a plane made up of the points (x,y), one for each complex number x+i\cdot y.
small positive angles get mapped by this formula into the upper right quarter of that plane,
counterclockwise, with angle \varphi=0 being mapped onto the positive half of the x-axis.

therefore one can imagine e^{b\cdot(\ln(\text{-}a)+\pi\cdot i)} as the formula (\text{-}a)^b=e^{b\cdot\ln(\text{-}a)} rotated by b times half a circle, by b\cdot 180°.

so no actual exponentiation is needed, just the two functions \exp(x)=e^x and its inverse function \ln(x)={_e\!\log x}.
I repeat, it would be sufficient to have just plus, times as operators and \exp and \ln as unary functions.

and being minimalistic might sound like a funny useless game, but here it plays an important role for thinking abstractly:
the concept of dualities is quite prominent in maths. sometimes a duality is between 2 opposites, sometimes it’s between 2 similar things.
here we have both, a duality between the 2 operators, and a duality between a function and its inverse function.
additionally there seems to be a duality-like relationship between binary operators and unary functions.

another unary function I already mentioned and even used above: \ln, the inverse function to exponentiation of the euler number (e^{\ln x}=x).
it also has the property that whatever power of a number you take, you can still retrieve the original exponent with the help of \ln.

the way to do it is by taking advantage of the general formulas above and dragging them over to the \ln function.
this way you get \ln(a\cdot b)=\ln(a)+\ln(b) and \ln(a^b)=b\cdot\ln(a), useful formulas when coping with this function.
it’s because e^{\ln(a\cdot b)}=a\cdot b=e^{\ln(a)}\cdot e^{\ln(b)}=e^{\ln(a)+\ln(b)} and e^{\ln(a^b)}=a^b=(e^{\ln(a)})^b=e^{\ln(a)\cdot b}.
so when you have a^x=b given, you can get x by applying \ln to both sides: x\cdot\ln a=\ln b, so x={\ln b\over\ln a}.
that's where the formula {_a\!\log x}={\ln x\over\ln a} comes from.
in computer-programming often {_2\!\log} is used instead of \ln, because the processor has some machine-language instruction built in for that and maybe not for \ln.
as you probably can see, the formulas I’ve proven here can be used for any \log-function, not just \ln!
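in python, math.log is \ln, and the base-change formula gives you every other log; log_base is a made-up helper name:

```python
import math

def log_base(a, x):
    # the log of x in base a, built only from ln
    return math.log(x) / math.log(a)

print(log_base(2, 8))      # ≈ 3.0
print(log_base(10, 1000))  # ≈ 3.0

# the identities proven above, checked on sample values:
a, b = 3.0, 5.0
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))
assert math.isclose(math.log(a ** b), b * math.log(a))
```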

another property of \ln is a bit beyond the scope of what one learns in middle-school.
in the complex plane, one can see that this function can be defined everywhere except at (0,0), but only locally.
i.e. given any complex number other than zero, the function can be defined everywhere nearby.
you just are not allowed to include (0,0), and you are not allowed to have a hole in your definition range which would surround that point.
you always must leave out a ray, or a curve whose distance from zero grows monotonically, starting at that point and going out to infinity. across this cut the function would behave really strangely.
so not only are there all the different \log functions; each of them, including \ln, comes in different versions with different definition ranges, and with the descriptions altered accordingly.
this isn’t surprising when you know that exponentiation is not an injective function in the complex numbers.
thereby it isn't bijective, and no inverse function exists. so what values does it assume, and in what way does it fail to be injective?

let's go back to the discovery that R\cdot e^{i\cdot \varphi} describes, counterclockwise, a circle of radius R.
obviously this means
e^{i\varphi}=\cos\varphi+i\cdot\sin\varphi.
but why a circle? as I promised, the formula e^x=1+x+x^2\cdot(2!)^{\text{-}1}+x^3\cdot(3!)^{\text{-}1}+x^4\cdot(4!)^{\text{-}1}+\cdots, without which using complex numbers for x wouldn't make sense, is the explanation.
observe what happens when you plug in i\varphi, and what happens when you plug in \text{-}i\varphi:
i goes through 4 states in this formula: i^1=i, then i^2=\text{-}1, then i^3=\text{-}i and i^4=1.
i^5=i again, and later in the sum the same pattern repeats on and on.
so looking at those 4-5 terms is enough:
plugging in \text{-}i\varphi gives pretty much the same, just all the imaginary terms have their sign flipped.

changing the sign of the imaginary part of a number (while leaving the real part as it is) is called conjugation.
it is described by drawing a line over the complex number. i.e. \overline{x+i\cdot y}=x-i\cdot y
therefore \text{-}i\varphi is conjugation of i\varphi. surprising is that also \overline{e^{i\varphi}}=e^{\text{-}i\varphi}=e^{\overline{i\varphi}}. thereby more generally \overline{e^z}=e^{\overline{z}}.

suppose e^{i\varphi}=x+i\cdot y. then also e^{\text{-}i\varphi}=x-i\cdot y.
when you multiply both you get 1=e^0=x^2+y^2.
that's the well-known formula for the circle of radius 1. so all points e^{i\varphi} are on that circle!
think about it: x^2+y^2 is constantly 1, no matter what \varphi you put into e^{i\varphi}!
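this can be checked numerically with python's cmath (a sanity check on a handful of arbitrarily chosen angles):

```python
import cmath
import math

# whichever angle phi you pick, e^{i*phi} lands on the circle of radius 1
for phi in (0.0, 0.7, math.pi / 2, 2.5, math.pi):
    z = cmath.exp(1j * phi)
    x, y = z.real, z.imag
    print(round(x * x + y * y, 12))  # 1.0 every time
```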

exponentiation is well known for its strong growth: the bigger the real number, the stronger the growth.
and it is well known that exponentiation is never zero. but can it be negative?
obviously in the real numbers it can't: it has positive values and is never zero, so how could it ever cross zero?
complex functions are difficult to depict, a function from 2d space to 2d space.
however, let’s say the input is a line in the complex plane, and output is the complex plane. the result is visible in 3d.

beware, this method creates a slightly biased picture: the line you choose determines how the thing will look.
take a straight line, and you get a curve in 3d. this curve moves through space as you move the line in the complex plane.
all those parallel lines in the domain together make up a surface. other families of lines could be mapped to the same surface.
a line perpendicular to those parallel lines would then appear as a curve in the plane orthogonal to the axis you chose for depicting the domain.
but this curve likely is not the graph of an actual function; it's just a mapping of real numbers onto a plane, a curve.
you could also use the graph of a helper-function in the domain instead of a straight line. those helper-function graphs can be moved along a straight line pointing in the direction you used for depicting the helper-function's values.
no intersections will happen, so a whole new shape is created, depending on the helper-function you choose.

take for example the exponential function. in the direction of the real line we all know this function.
also I have said here that e^{i\varphi} is a mapping from real numbers onto a circle in a plane.
so that’s what you’d get mapping the exponential function with the axis of the real numbers as input:
an exponential graph rotated around the x-axis. each y-z-slice orthogonal to that x-axis is a full circle.
however, just look at what you’d get if choosing the imaginary axis instead of the real one.
e^{i\varphi} would become a curve spiraling on a cylindrical surface, never changing size.
parallel lines look the same, just different in size. that size grows or shrinks exponentially.
and if you'd use a logarithmic graph as a helper-function, the result might look even more different.
i.e. \exp(\ln y+iy)=e^{\ln y}\cdot e^{iy}=y\cdot e^{iy}, similar shape but linear instead of exponential growth.

however, making a 3d-animation in which the angle of the straight line rotates could give a good impression.
unfortunately I haven't seen any program that could display such a 3d-film, let alone create one…

what you get this way is some mix between exponential function and \sin and \cos.
those trigonometric functions are definitely not injective, they periodically repeat themselves. so does \exp.

the real values it assumes are positive and negative numbers. since e^{i\pi}=\text{-}1, its square will be 1 again.
multiply e^{i\pi} another time and you get back to -1. in general e^{n\cdot 2\pi i}=1 for all whole numbers n.
this makes \exp non-injective, each and every value gets assumed infinitely many times.
but things aren’t that bad. knowing \ln for a truncated \exp function is enough. so just cut off all the values that repeat and define the domain accordingly for a total function.
this way usually \ln is defined for all numbers except zero and a ray starting in zero going along the negative real numbers.
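python's cmath.log implements exactly such a truncated version; this snippet (sample values chosen arbitrarily) shows both the repetition of \exp and the cut:

```python
import cmath
import math

# exp repeats: e^{z + n*2*pi*i} = e^z for every whole number n
z = 0.3 + 1.1j
for n in (1, -2):
    print(abs(cmath.exp(z + n * 2j * math.pi) - cmath.exp(z)) < 1e-12)  # True

# cmath.log picks the branch whose imaginary part lies in (-pi, pi],
# with the cut along the negative real axis
print(cmath.log(-1))  # imaginary part is pi
# an angle beyond pi gets wrapped back into (-pi, pi]:
print(cmath.log(cmath.exp(5j)))  # imaginary part is 5 - 2*pi, not 5
```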

in the formula-collection you’ll find the definition:
\ln x=2\cdot({x-1\over x+1}+({x-1\over x+1})^3\cdot 3^{\text{-}1}+\cdots)=2\cdot\sum\limits_{n=0}^\infty({x-1\over x+1})^{2n+1}\cdot(2n+1)^{\text{-}1} for x>0
there also are formulas for 0<x<2, and they might look a lot simpler.
the problem lies in how such formulas get created: pick a point and you get the function defined within a circle around that point.
since it cannot be defined at zero, such attempts will always give very limited results…

for a function to calculate the negative half, just take \ln(\text{-}x) and add \pi\cdot i to its output, thereby rotating by 180°. i.e. \widetilde\ln x=\pi\cdot i+\ln(\text{-}x) is then defined for x<0.
just use \ln or \widetilde\ln depending on where you are looking.
so in addition to \sin\varphi=\text{Im}\, e^{i\varphi}={e^{i\varphi}-e^{\text{-}i\varphi}\over 2i} and \cos\varphi=\text{Re}\, e^{i\varphi}={e^{i\varphi}+e^{\text{-}i\varphi}\over 2}, we now can also write \arcsin x={\ln(ix+\sqrt{1-x^2})\over i} and \arccos x={\ln(x+i\sqrt{1-x^2})\over i}

but forget trigonometric functions! do everything 2-dimensional directly on a calculator (one with the ability to calculate with complex numbers).
type R\cdot e^{i\varphi} to get the point at distance R and angle \varphi; that translates polar coordinates to cartesian coordinates.
use \ln(x+iy) to calculate \ln R and \varphi: the former in the real part, the latter in the imaginary part of the output.
add to that the knowledge that scaling up a triangle scales up each of its sides by the same factor,
maybe some Pythagoras (x^2+y^2=R^2), and you have trigonometry covered.
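with python's cmath playing the calculator, both directions look like this (R and phi chosen arbitrarily):

```python
import cmath
import math

R, phi = 2.0, math.pi / 6
# polar -> cartesian: just evaluate R * e^{i*phi}
z = R * cmath.exp(1j * phi)
print(z.real, z.imag)  # 2*cos(30°) ≈ 1.732 and 2*sin(30°) ≈ 1.0

# cartesian -> polar: ln(x+iy) holds ln R in the real part and phi in the imaginary part
w = cmath.log(z)
print(math.exp(w.real))  # ≈ 2.0, the radius R
print(w.imag)            # ≈ 0.5236, the angle pi/6
```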

a bit more subtle is the idea of calculating the derivative for a function. what is it for?
abstractly seen, the process of creating a derivative, differentiation, is itself just a function.
it takes a unary function as input and outputs a function in the same variable.
it is written by drawing a small vertical line (a prime) above and next to the function's glyph, or just a dot above it.
it transforms the function according to certain rules.

since there are only 2 operators (plus and times) and 2 functions (\exp and \ln), defining the derivative is easy:

every term that is connected by plus to other terms gets handled individually. the derivative of a finite sum is the sum of individual derivatives. (a(x)+b(x))'=a'(x)+b'(x)
product are more complicated. a product becomes the sum of the same amount of products of same size. and additionally in each such term of the output a different term of the product is picked out and the derivative of that is calculated — the other terms of the product stay the same as without derivative. (a(x)\cdot b(x))'=a'(x)\cdot b(x)+a(x)\cdot b'(x) or (\prod\limits_{k=1}^N a_k(x))'=\sum\limits_{k=1}^N(\prod\limits_{l=1}^{k-1}a_l(x))\cdot a_k'(x)\cdot(\prod\limits_{l=k+1}^N a_l(x))
the variable x to the power of a constant becomes that constant times x to the power of that constant minus one. (x^c)'=c\cdot x^{c-1}
the euler-number to the power of x is already its own derivative, nothing changes. (e^x)'=e^x
the derivative of \ln x is x to the power of -1. (\ln x)'=x^{\text{-}1}
a function evaluated at the output of another function becomes the product of the derivatives of both functions. keep in mind that the outer function must first get the derivative applied, before you insert the 2nd function into its variable. (a(b(x)))'=a'(b(x))\cdot b'(x)
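since these rules are purely mechanical, they fit into a tiny python program. the tuple encoding and the names d and ev are my own invention; this is a sketch, not a full computer-algebra system:

```python
import math

# expressions as nested tuples: ('x',) the variable, ('const', c),
# ('add', a, b), ('mul', a, b), ('pow', a, c) for a^c with constant c,
# ('exp', a), ('ln', a)

def d(e):
    """derivative of expression e, rule by rule as listed above."""
    tag = e[0]
    if tag == 'x':     return ('const', 1)
    if tag == 'const': return ('const', 0)
    if tag == 'add':   # (a+b)' = a' + b'
        return ('add', d(e[1]), d(e[2]))
    if tag == 'mul':   # (a*b)' = a'*b + a*b'
        return ('add', ('mul', d(e[1]), e[2]), ('mul', e[1], d(e[2])))
    if tag == 'pow':   # (a^c)' = c * a^(c-1) * a'   (chain rule included)
        a, c = e[1], e[2]
        return ('mul', ('mul', ('const', c), ('pow', a, c - 1)), d(a))
    if tag == 'exp':   # (e^a)' = e^a * a'
        return ('mul', e, d(e[1]))
    if tag == 'ln':    # (ln a)' = a^(-1) * a'
        return ('mul', ('pow', e[1], -1), d(e[1]))

def ev(e, x):
    """evaluate expression e at the value x."""
    tag = e[0]
    if tag == 'x':     return x
    if tag == 'const': return e[1]
    if tag == 'add':   return ev(e[1], x) + ev(e[2], x)
    if tag == 'mul':   return ev(e[1], x) * ev(e[2], x)
    if tag == 'pow':   return ev(e[1], x) ** e[2]
    if tag == 'exp':   return math.exp(ev(e[1], x))
    if tag == 'ln':    return math.log(ev(e[1], x))

# f(x) = x^2 + ln(x), so f'(x) = 2x + 1/x
f = ('add', ('pow', ('x',), 2), ('ln', ('x',)))
print(ev(d(f), 2.0))  # 2*2 + 1/2 = 4.5
```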

another way to write the derivative is f'(x)= {\partial f\over\partial x}(x).
in this last rule, let's say the variable u stands for b(x).
then in this new way of writing, the rule would look like this: {\partial\over \partial x}a(b(x))={\partial a\over\partial u}\cdot{\partial u\over\partial x}

but in case you come upon a function that isn't those 4 functions combined in some way, there is a much more general definition.
take a look at {f(x)-f(x_0)\over x-x_0} and imagine x to be very close to x_0.
when you draw the graph of f (input on the x-coordinate, output on y), and you draw the graph of x\cdot f'(x_0) (a line through zero), you will notice that at x_0 both have exactly the same growth.
the reason is that {y-d\over x}=k is just the k of the line kx+d through (0,d).
so, when f(x_0)=d and x_0=0, then {f(x)-f(x_0)\over x-x_0}={f(x)-d\over x}.
for each x a line is picked out, a slope is chosen. the closer x comes to x_0, the better such a line approximates the function's actual slope at that point.
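numerically you can watch this convergence; math.sin stands in for f here, and the point x_0=1 is arbitrary:

```python
import math

def diff_quotient(f, x0, h):
    # {f(x) - f(x0)} / {x - x0}  with  x = x0 + h
    return (f(x0 + h) - f(x0)) / h

# as x moves closer to x0, the quotient approaches the derivative f'(x0)
for h in (0.1, 0.001, 0.00001):
    print(diff_quotient(math.sin, 1.0, h))  # approaches cos(1) = 0.54030...
```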

one thing I should say about the geometry of linear functions:
all linear functions are of the form y=kx+d, or implicitly ax+by=c.
both are the same, with k=\text{-}{a\over b} and d={c\over b}.
however the most important knowledge here is a completely different way of writing linear functions:
a linear function that depicts a plane in 3D or line in 2D is a function in 2 respectively 1 variables.
the surrounding space however has one additional direction, so there is a line perpendicular to the function.
for 2D this perpendicular line is determined by \begin{pmatrix} y \\ \text{-}x \end{pmatrix} for a vector \begin{pmatrix} x \\ y \end{pmatrix}.
in 3D there's an operator commonly written as "\times", usually called the cross product ("outer product" is how I learned it's called).
just take any 2 vectors spanning the plane, apply that operator, and you get the perpendicular vector.
\begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix}\times\begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix}=\begin{pmatrix} {\det\begin{pmatrix} a_y & b_y \\ a_z & b_z \end{pmatrix}} \\ {\det\begin{pmatrix} a_z & b_z \\ a_x & b_x \end{pmatrix}} \\ {\det\begin{pmatrix} a_x & b_x \\ a_y & b_y \end{pmatrix}} \end{pmatrix} with \det \begin{pmatrix} a_x & b_x \\ a_y & b_y \end{pmatrix}=a_x\cdot b_y-b_x\cdot a_y
once you have the perpendicular vector \vec q and a point on the surface/line \vec p, use the inner product (multiply each pair of components and sum them up) to create the formula:
\vec v\cdot\vec q=\vec p\cdot\vec q where \vec v=\begin{pmatrix} x \\ y \\ z \end{pmatrix} respectively \vec v=\begin{pmatrix} x \\ y \end{pmatrix}
when \vec p is zero, it is obvious why this works: the inner product is zero when both vectors are perpendicular to each other.
but more generally \vec a\cdot \vec b=\|\vec a\|\cdot\|\vec b\|\cdot\cos\angle(\vec a,\vec b)
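both products are short enough to write out by hand in python; cross and dot are my own names, vectors are plain tuples:

```python
def cross(a, b):
    # the perpendicular ("outer"/cross) product of two 3d vectors
    return (a[1] * b[2] - b[1] * a[2],
            a[2] * b[0] - b[2] * a[0],
            a[0] * b[1] - b[0] * a[1])

def dot(a, b):
    # the inner product: multiply each pair of components and sum them up
    return sum(ak * bk for ak, bk in zip(a, b))

# two vectors spanning a plane, and a point p on it
u, v, p = (1, 0, 0), (0, 1, 0), (0, 0, 2)
q = cross(u, v)              # perpendicular to the plane
c = dot(p, q)                # the constant side of  v·q = p·q
print(q, c)                  # (0, 0, 1) 2  -> the plane z = 2
# q really is perpendicular to both spanning vectors:
print(dot(q, u), dot(q, v))  # 0 0
```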

when you look at a plane with a point p which isn’t the point closest to zero, then there will be a whole circle of points with exactly the same distance to zero.
also this whole circle will always have the same angle to the orthogonal vector.
so in this formula the distances are all the same as well as the \cos, throughout the whole circle.
therefore the inner product is constant on that circle.
remember, \vec a\cdot\vec b=\sum\limits_{k=1}^{\dim a}a_k\cdot b_k is linear in each component, multilinear!
since the equation \vec v\cdot\vec q=\vec p\cdot\vec q is linear and there are more than 2 points of the plane fulfilling it, that’s also the equation for all the other points.
i.e. as \cos\angle(\vec v,\vec q) increases, the distance decreases to compensate that. and the other way around.

for example we get \begin{pmatrix} x \\ y \end{pmatrix}\cdot\begin{pmatrix} \text{-}f'(x_0) \\ 1 \end{pmatrix}=\begin{pmatrix} x_0 \\ f(x_0) \end{pmatrix}\cdot\begin{pmatrix} \text{-}f'(x_0) \\ 1 \end{pmatrix} is tangential to the function in x_0.
subtract the x-term: y=f'(x_0)\cdot x+(f(x_0)-x_0\cdot f'(x_0))

there are many applications for this. most prominently you can figure out minimum and maximum of a function.
it won't tell you exactly where those are, but it definitely shrinks the set of candidates down to a few points.
minimum and maximum can only happen at the border, or when there is no derivative at all, or when the derivative is zero.
derivative says something about the growth of a function. before it will start to grow or before it will start to decline, the function will go through a point where growth is zero.
alternatively the growth might make a jump, from one value to another. in the point where it jumps, the derivative does not exist, that’s a candidate for minimum or maximum too.
in a 3D function graph, with 2 axes reserved for the domain, the border of the domain again gives a function in fewer variables. there could be a maximum or minimum anywhere on that border; you'd have to calculate the derivative along it to know where.

this was, in a nutshell, what I learned in middle school and only grasped at uni.
I didn't talk about probability and integration (the inverse of differentiation).
those I didn’t learn before uni, not from my teachers. it’s in the math-books though.
but based on above, it shouldn’t be too difficult.

if you're an advanced mathematician, it may be interesting to see how little people learn in middle school.
if you haven’t finished middle-school yet, or are re-learning all that stuff, good luck.
all those things are not difficult, although I might have made things seem difficult.
my goal with this posting was to show what abstract concepts there already are at middle-school.
and I wanted to point out how big an advantage it was to think abstractly in middle-school already.
as always I’m open to any critique or suggestions…