Julia: Tutorial & Code-Collection¶

Source: https://github.com/markomlikota/CodingSoftware

MIT License, © Marko Mlikota, https://markomlikota.github.io

Exercises¶

1 Introduction¶

2 Basics & Scalars¶

Exercise 2.a.i¶

Compute $33/19$ and round the result to 2 digits.

In [58]:
# Solution:
round(33/19, digits=2)
# or:
a = 33/19
round(a, digits=2)
Out[58]:
1.74

Exercise 2.a.ii¶

Define $a = 3.5$, $b = Inf$ and $c = NaN$. Check for each object whether it's finite.

In [59]:
# Solution:
a = 3.5
b = Inf
c = NaN
isfinite(a) 
isfinite(b)
isfinite(c)
Out[59]:
false

Exercise 2.b.i¶

Create the complex number $z = 9 + 3im$. Store the real part and calculate it's square root

In [60]:
# Solution:
z = 9 + 3im
realZ = real(z)
sqrt(realZ)
Out[60]:
3.0

Exercise 2.b.ii¶

Write a boolean expression to test that $5$ is strictly between $3$ and $9$

In [61]:
# Solution:
x = 5
isxInRange = (x > 3) && (x < 9)
Out[61]:
true

Exercise 2.b.iii¶

Extend the previous exercise: add another condition to check that $5$ is also not equal to $7$.

In [62]:
# Solution:
isxValid = isxInRange && (5 !=7)
Out[62]:
true

Exercise 2.b.iv¶

Convert the string "33//19" to a rational number and add $1//4$ to it. Now convert the result to a floating number rounded to $3$ digits. Finally, create a sentence that prints your result, e.g. "The result is x".

In [63]:
# Solution:
s = "33//19"
sR = parse(Rational{Int}, "33//19") + 1//4
sF = round(Float64(sR); digits=3)
println("The result is $sF.")
The result is 1.987.

Exercise 2.b.v¶

Install the package "Distributions".

In [65]:
# Solution:
using Pkg
Pkg.add("Distributions")
   Resolving package versions...
  No Changes to `/opt/conda/julia/environments/v1.10/Project.toml`
  No Changes to `/opt/conda/julia/environments/v1.10/Manifest.toml`
Precompiling project...
  ? Symbolics
  ? Symbolics → SymbolicsSymPyExt
  ? Symbolics → SymbolicsPreallocationToolsExt

3 Functions¶

Exercise 3.a.i¶

Write a function $fAfterInc(x)$ that increases a number by $15\%$ and returns the result rounded. Test it for $x = 100$.

In [66]:
# Solution:
fAfterInc(x) = x * 1.15
println("The full price is ", round(fAfterInc(100)))
The full price is 115.0

Exercise 3.a.ii¶

Write a function $fProfit(revenue, cost)$ that returns both the profit ($revenue – cost$) and the profit margin ($profit / revenue$). Apply for revenue of $600$ and cost of $450$.

In [67]:
# Solution:
function fProfit(rev,cost)
    p = rev - cost
    m = p/rev
    return p,m
end
p,m = fProfit(600,450)
println("Profit $p, Profit margin $m")
Profit 150, Profit margin 0.25

Exercise 3.a.iii¶

Write a function $fOut(y)$ that returns another function $fInn(x)$. The inner function should check if $x$ is bigger than $y$. Then apply $fOut$ at $y=5$ and apply the obtained $fInn$ to $x = 3$ and $x = 6$.

In [68]:
# Solution:
function fOut(y)
    function fInn(x)
        return x > y
    end 
    return fInn
end
fInn = fOut(5)
println(fInn(3))
println(fInn(6))
false
true

4 Vectors¶

Exercise 4.a.i¶

Create a vector $b$ that goes from $4$ to $44$ in steps of $8$.

In [69]:
# Solution: 
vb = collect(4:8:44)
Out[69]:
6-element Vector{Int64}:
  4
 12
 20
 28
 36
 44

Exercise 4.a.ii¶

Now multiply each element of vector $b$ by $3$ and store in a new vector.

In [70]:
# Solution: 
vb3 = vb .* 3
Out[70]:
6-element Vector{Int64}:
  12
  36
  60
  84
 108
 132

Exercise 4.b.i¶

Let $a = (2, 4, -8, 9, -3, 8, 43, 4)'$ be a $8 \times 1$-vector. Find the largest and smallest elements in $a$ without using maximum() or minimum(), but by using max() and min().

In [71]:
# Solution: 
va = [2,4,-8,9,-3,8,43,4]
max(va...) # is same as maximum(va) 
min(va...) # is same as minimum(va)
Out[71]:
-8

Exercise 4.b.ii¶

Let $b = (1, -4, 5, -9, 3, -8, 14, -4)$ be a vector. Define a function $fAbs(x)$ that returns absolute value of $x$ and then compute the sum of the absolute values of all elements in $b$.

In [72]:
# Solution: 
vb = [1, -4, 5, -9, 3, -8, 14, -4]
fAbs(x) = abs(x)
total = mapreduce(fAbs,+,vb)
println("Sum of absolute values = ", total)
Sum of absolute values = 48

Exercise 4.c.i¶

Define the vector $c = (4, 5, 2, 0, 8,10 ,-5, 16, 9, 15)$. Print out the first, last and every third element.

In [73]:
# Solution: 
vc = [4, 5, 2, 0, 8,10 ,-5, 16, 9, 15]
vcFirst = vc[1]
vcLast =vc[end]
vcStep3 = vc[1:3:length(vc)]
println(vcFirst, ", ", vcLast, ", ", vcStep3)
4, 15, [4, 0, -5, 15]

Exercise 4.c.ii¶

Now change the first element of $c$ to $10$, insert number $99$ in second position and multiply the first two element by $2$. Print the modified vector $c$.

In [74]:
# Solution: 
vc[1] = 10
insert!(vc,2,99)
vc[1:2] .*= 2
println("vc = ", vc)
vc = [20, 198, 5, 2, 0, 8, 10, -5, 16, 9, 15]

Exercise 4.c.iii¶

Given the student grades $(45, 72, 88, 55, 39, 91, 60)$, print the grades from highest to lowest and then print only the passing grades ($>=60$)

In [75]:
# Solution: 
vGrades = [45, 72, 88, 55, 39, 91, 60]
sortedGrades = sort(vGrades, rev=true)
println("Sorted grades =", sortedGrades)
passGrades = filter(x-> x>=60, vGrades)
println("Passing grades = ", passGrades)
Sorted grades =[91, 88, 72, 60, 55, 45, 39]
Passing grades = [72, 88, 91, 60]

Exercise 4.d.i¶

Create a vector of vectors $vva = ((1,2),(3,4))$. Make a copy $vvaa$ of it. Now, change the second element of second subvector in $vva$ to 50 and print $vvaa$. What do you observe? How can this be avioded?

In [76]:
# Solution:
vva = [[1,2],[3,4]]
vvaa = copy(vva)
vva[2][2] = 50
println("vvaa after modifying vva = ", vvaa)
# To avoid this, we use "deepcopy":
vva = [[1,2],[3,4]]
vvaa = deepcopy(vva)
vva[2][2] = 50
println("vvaa with deepcopy after modifying vva = ", vvaa)
vvaa after modifying vva = [[1, 2], [3, 50]]
vvaa with deepcopy after modifying vva = [[1, 2], [3, 4]]

Exercise 4.d.ii¶

Create a vector $b = (2, 4, 6)$. Define $bsq$ as the square of each element in $b$. Change an element of $b$ and then print $bsq$. What do you observe?

In [77]:
# Solution:
vb = [2, 4, 6]
vbsq = vb .^ 2   
println("b = ", vb, ", bsq = ", vbsq)
vb[2] = 100
println("b = ", vb, ", bsq = ", vbsq)   
# bsq does not change with changes in b. No copy is needed.
b = [2, 4, 6], bsq = [4, 16, 36]
b = [2, 100, 6], bsq = [4, 16, 36]

5 Matrices & Arrays¶

Exercise 5.a.i¶

Combine the vectors $b$ and $bsq$ from previous exercise into matrix $Z$. Replace the second column of $Z$ to take values $1$,$2$ and $3$.

In [78]:
# Solution:
mZ = [vb vbsq]
mZ[:,2] = 1:3
Out[78]:
1:3

Exercise 5.a.ii¶

Create the matrix $$ X = \begin{bmatrix} 3 & 6 & 9 \\ 4 & 8 & 19 \\ 5 & 10 & 15 \end{bmatrix} \; .$$ Find all positions of elements greater than $6$.

In [79]:
# Solution:
mX = [3 6 9; 4 8 12; 5 10 15]
findall(x -> x > 6, mX)
Out[79]:
5-element Vector{CartesianIndex{2}}:
 CartesianIndex(2, 2)
 CartesianIndex(3, 2)
 CartesianIndex(1, 3)
 CartesianIndex(2, 3)
 CartesianIndex(3, 3)

Exercise 5.b.i¶

You are given the matrix $$ Y = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \; .$$ Identify the factors by which the matrix $Y$ stretches in space (eigenvalues), and the directions in which this stretching happens (eigenvectors).

In [80]:
# Solution:
mY = [2 1 0; 1 2 1; 0 1 2]
using LinearAlgebra

eValues = eigvals(mY)
eVec = eigvecs(mY)

println("Stretching factors (eigenvalues) = ", eValues)
println("Directions (eigenvectors) =\n", eVec)
Stretching factors (eigenvalues) = [0.585786437626905, 1.9999999999999998, 3.414213562373095]
Directions (eigenvectors) =
[-0.5000000000000009 -0.7071067811865468 0.5000000000000001; 0.7071067811865476 -1.0990647210786414e-15 0.7071067811865475; -0.4999999999999991 0.7071067811865482 0.4999999999999999]

Exercise 5.b.ii¶

Take the vector $S = (2, 5, 1, 4, 3, 6, 7, 0, 8)$ and transform it into a 3×3 matrix $mS$; compute the column-wise cumulative sum, then find the mean of the lower triangular part of $mS$ (including the diagonal).

In [82]:
# Solution:
vS = [2, 5, 1, 4, 3, 6, 7, 0, 8]
using LinearAlgebra
using Statistics
mS = reshape(vS, 3, 3)

mSCumsum = cumsum(mS, dims=1)
mSLowtri = mean(tril(mS)) 

println("M =\n", mS)
println("cumsum by columns =\n", mSCumsum)
println("Mean of lower triangular = ", round(mSLowtri))
M =
[2 4 7; 5 3 0; 1 6 8]
cumsum by columns =
[2 4 7; 7 7 7; 8 13 15]
Mean of lower triangular = 3.0

Exercise 5.b.iii¶

The matrix below shows production costs over three months (rows) for two goods (columns): $$ Cost = \begin{bmatrix} 10 & 12 \\ 11 & 15 \\ 9 & 14 \end{bmatrix} \; . $$ How much does the cost of each good vary over time? Which good shows more fluctuation in costs?

In [83]:
# Solution:
mCost = [10 12; 11 15; 9 14]
using Distributions
mCostvar = var(mCost, dims=1) 
mCostvar #Good 2 higher variance
Out[83]:
1×2 Matrix{Float64}:
 1.0  2.33333

Exercise 5.b.vi¶

Continuing the previous exercise: how do the costs of the two goods move together, and how closely related are they?

In [84]:
# Solution:
mCostcovar = cov(mCost)
mCostcor = cor(mCost)
Out[84]:
2×2 Matrix{Float64}:
 1.0       0.327327
 0.327327  1.0

6 Conditionals & Loops¶

Exercise 6.a.i¶

Given vector $a = (1, 4, 5, 6, 10)$. Compute it's mean and if the mean is greater than $5$, print "High average", else print "Low average".

In [85]:
# Solution:
va = (1, 4, 5, 6, 10)
vamean = mean(va)

if vamean > 5
    println("High average")
else
    println("Low average")
end
#or 
println(vamean > 5 ? "High average" : "Low average")
High average
High average

Exercise 6.a.ii¶

Create the matrix $$ Q = \begin{bmatrix} 1 & 0 & 3 \\ 8 & 5 & 6 \\ 9 & 1 & 9 \end{bmatrix} \; .$$ Use a nested for-loop to go through each row and column, print the position (row, column) and the element, and then also print its square.

In [86]:
# Solution:
mQ = [1 2 3; 4 5 6; 7 8 9]

for i = 1:size(mQ ,1)          
    for j = 1:size(mQ ,2)      
        val = mQ[i,j]
        println("Element at (", i, ",", j, ") = ", val, 
                ", squared = ", val^2)
    end
end
Element at (1,1) = 1, squared = 1
Element at (1,2) = 2, squared = 4
Element at (1,3) = 3, squared = 9
Element at (2,1) = 4, squared = 16
Element at (2,2) = 5, squared = 25
Element at (2,3) = 6, squared = 36
Element at (3,1) = 7, squared = 49
Element at (3,2) = 8, squared = 64
Element at (3,3) = 9, squared = 81

Exercise 6.a.iii¶

Define the function $f(x) = x^2 - 10$. Start with $x=1$ and repeatedly increase $x$ by $1$ until $f(x)$ becomes larger than $50$. Print the value of $x$ when this condition is first satisfied.

In [87]:
# Solution:
f(x) = x^2 - 10
x = 1
while f(x) <= 50
    println("x = ", x, ", f(x) = ", f(x))
    x += 1
end
println("Stopped at x = ", x, " because f(x) > 50")
x = 1, f(x) = -9
x = 2, f(x) = -6
x = 3, f(x) = -1
x = 4, f(x) = 6
x = 5, f(x) = 15
x = 6, f(x) = 26
x = 7, f(x) = 39
Stopped at x = 8 because f(x) > 50

7 Object-Scopes¶

Exercise 7.a.i¶

Define $s = 0$. Write a for loop that adds the numbers $1:5$ to $s$. Now write a function $fadd()$ that i) takes no arguments, ii) performs a similar loop, adding $1:7$ to $s$, and iii) returns $s$. Execute the function. What happens? Does $s$ change outside the function?

In [88]:
# Solution:
s = 0
for i in 1:5
    s += i
end
println("Global loop result = ", s)   # 15

function fadd()
    for i in 1:7
        s += i   # tries to use local s, not the global one
    end
   return s
end
fadd()
println("After function, global s is still = ", s)  # still 15, is unchanged
Global loop result = 15
UndefVarError: `s` not defined

Stacktrace:
 [1] fadd()
   @ Main ./In[88]:10
 [2] top-level scope
   @ In[88]:14

Exercise 7.a.ii¶

Adjust the function from the previous exercise so that it indeed returns the sum of $1:7$.

In [90]:
# Solution:
s = 0
function fadd()
    global s
    for i in 1:7
        s += i
    end
    return s
end
fadd()
println("After declaring it as global, s = ", s)
#Or you can define a new local variable without changing s:
function faddLocal()
    total = 0
    for i in 1:7
        total += i
    end
    return total
end
faddLocal()
println("Alternatively, after adding a new local, local variable = ", faddLocal())
After declaring it as global, s = 28
Alternatively, after adding a new local, local variable = 28

8 Further Object-Types¶

Exercise 8.a.i¶

Write a Julia-expression for Utility $U(x,y) = x^{0.5} y^{0.5}$. Evalutate it for $x = 4$ and $y = 9$

In [91]:
# Solution:
exUtil = :(x^0.5 * y^0.5)
x=4; y=9
println("Utility = ",  eval(exUtil))
Utility = 6.0

Exercise 8.a.ii¶

Create a vector of symbols $(:a, :b, :v)$. Write a loop that builds the expression $a1 + b2 + c3$ using the symbols and their index numbers. Lastly evaluate the expression for $a = 2, b = 3, c = 4$.

In [92]:
# Solution:

vSyms = [:a, :b, :c] 

expr = :(0)

for i = 1:length(vSyms)
    expr = :( $expr + $(vSyms[i]) * $i )
end

println(expr)

a = 2; b = 3; c = 4
println("Result = ", eval(expr))
((0 + a * 1) + b * 2) + c * 3
Result = 20

Exercise 8.a.iii¶

You are given a tuple representing a point in two-dimensional space $P = (3, 4)$. Extract the $x$ and $y$ coordinates from the tuple and compute the distance of the point from the origin $(0, 0)$. Try to change the x coordinate inside the tuple to see what happens.

In [93]:
# Solution:

tP = (3, 4)

x = tP[1]
y = tP[2]

xydist = sqrt(x^2 + y^2)
println("Distance from origin = ", xydist)

tP[1] = 5   # Error because you cannot change tuple but can overwrite the object.
Distance from origin = 5.0
MethodError: no method matching setindex!(::Tuple{Int64, Int64}, ::Int64, ::Int64)

Stacktrace:
 [1] top-level scope
   @ In[93]:11

Exercise 8.a.vi¶

Create a Dataframe with the given data $$\begin{array}{|c|c|c|c|} \hline \textbf{Year} & \textbf{GDP} & \textbf{Inflation} & \textbf{Unemployment} \\ \hline 2020 & 500 & 2.1 & 5.0 \\ 2021 & 550 & 1.8 & 4.7 \\ 2022 & 600 & 2.5 & 4.5 \\ \hline \end{array} $$

Print the Dataframe. Extract the column GDP and calculate it's average.

In [94]:
# Solution:

using DataFrames

dfMacro = DataFrame(Year = Int[], GDP = Float64[], Inflation = Float64[], Unemployment = Float64[])
push!(dfMacro, (2020, 500.0, 2.1, 5.0))
push!(dfMacro, (2021, 550.0, 1.8, 4.7))
push!(dfMacro, (2022, 600.0, 2.5, 4.5))

println(dfMacro)

avggdp = mean(dfMacro[1:3,2]) # or
avggdp = mean(dfMacro.GDP)

println("Average GDP = ", avggdp)
3×4 DataFrame
 Row │ Year   GDP      Inflation  Unemployment 
     │ Int64  Float64  Float64    Float64      
─────┼─────────────────────────────────────────
   1 │  2020    500.0        2.1           5.0
   2 │  2021    550.0        1.8           4.7
   3 │  2022    600.0        2.5           4.5
Average GDP = 550.0

9 Path-, Folder- & Workspace-Management¶

Exercise 9.a.i¶

Create a folder called "Inputs" and, in there, write a file called params.jl, which defines: $\mu = 50$ (mean), $\sigma = 10$ (std. deviation).

In your main/current file, load params.jl, define a function $fZscore(x) = (x - \mu)/\sigma$ that uses the parameters from params.jl, and print the result for $x = 70$.

In [97]:
# Solution:
sPathMain            = "/files/Julia/TutorialCodecollection/" # (insert your own path here)
sPathInput          = sPathMain * "/Input/"

try
 mkdir(sPathInput) # create folder "Input" (can also do manually, using the folder-system on your computer)
 catch
end

# write file.. typically done manually, but note that one can also write a new file using Julia:
cd(sPathInput)
write("params.jl", "μ = 50 \n σ = 10 \n") 

include(sPathInput * "params.jl")

fZscore(x) = (x - μ) / σ
println("zscore(70) = ", fZscore(70))
zscore(70) = 2.0

Exercise 9.a.ii¶

Create a new subfolder called "Functions" inside your main working directory. Inside this folder, manually create a file called area_circle.jl that defines the following function: $fAreacircle(x) = \pi r^2$.

Then, in your main/current file, load area_circle.jl and print the results of the function with $r = 5$.

In [100]:
# Solution:
sPathMain            = "/files/Julia/TutorialCodecollection/" # (insert your own path here)
sPathFunctions       = sPathMain * "Functions/"

try
 mkdir(sPathFunctions)
 catch
end

# write file.. typically done manually, but note that one can also write a new file using Julia:
cd(sPathFunctions)
write("area_circle.jl", "fAreacircle(r) = pi * r^2 \n") 

include(sPathFunctions * "area_circle.jl")

println("Area of circle with r=5 = ", fAreacircle(5))
Area of circle with r=5 = 78.53981633974483

10 Storing & Loading Data¶

Exercise 10.a.i¶

Define the vector $c = [3,4,6]$ and store it as a .txt-file in your current working directory. Then, load the data from this .txt-file to a vector $d$. Make sure your vector $d$ is actually a vector and not a matrix.

In [28]:
# Solution:

sPathMain            = "/files/Julia/TutorialCodecollection/" # (insert your own path here)
cd(sPathMain)

using DelimitedFiles

vc = [3,4,6]

vcFilename = "myamazingfile.txt"
writedlm(vcFilename, vc)

vd = readdlm(vcFilename)
vd = vec(vd)
Out[28]:
3-element Vector{Float64}:
 3.0
 4.0
 6.0

Exercise 10.a.ii¶

You are given GDP data $$ \begin{array}{|c|c|} \hline \textbf{Year} & \textbf{GDP} \\ \hline 2020 & 500 \\ 2021 & 550 \\ 2022 & 600 \\ 2023 & 520 \\ \hline \end{array} $$

Create a dataframe 'dfgdp' and save it as gdp.csv file in your current directory. Then reload the again into a new dataframe and caculate the GDP growth from $2020$ to $2023$.

In [29]:
# Solution:

sPathMain            = "/files/Julia/TutorialCodecollection/" # (insert your own path here)
cd(sPathMain)

using DataFrames
using CSV

dfgdp = DataFrame(Year = Int[], GDP = Float64[])
push!(dfgdp, (2020, 500))
push!(dfgdp, (2021, 550))
push!(dfgdp, (2022, 600))
push!(dfgdp, (2022, 520))
# or
#dfgdp = DataFrame(Year=[2020, 2021, 2022, 2023], GDP=[500, 550, 600, 520])

CSV.write("gdp.csv", dfgdp)

dfgdpLoad = CSV.read("gdp.csv", DataFrame)

gdpGrowth = (dfgdpLoad.GDP[end] - dfgdpLoad.GDP[1]) / dfgdpLoad.GDP[1] * 100
println("GDP growth 2020–2023 = ", gdpGrowth, "%")
GDP growth 2020–2023 = 4.0%

11 Random Variables¶

Exercise 11.a.i¶

Draw $1000$ random numbers from the standard normal distribution, i.e. $N \sim \mathcal{N}(0,1)$. Store them in a vector $x$. Next, compute the mean and standard deviation of your draws.

In [1]:
# Solution:

using Distributions

vx = rand(Normal(0, 1), 1000) 

vxMean = mean(vx)
vxSd   = std(vx)

println("Sample mean  = ", vxMean)
println("Sample sd    = ", vxSd)
println("For population: mean=0, sd=1")
Sample mean  = -0.009850913989838282
Sample sd    = 0.999623762480373
For population: mean=0, sd=1

Exercise 11.a.ii¶

Generate a 4×4 matrix $R$ of Uniform random numbers with range $[0,1]$. Replace all entries greater than $0.5$ with $1$, and the rest with $0$ and print the resulting matrix.

In [2]:
# Solution:

using Random
using Statistics

a = MersenneTwister(1234)
rand(a)

mR = rand(4, 4)          
mRbinary = (mR .> 0.5) * 1

println("Original Matrix:\n", mR)
println("New Matrix:\n", mRbinary)
Original Matrix:
[0.8035878223436029 0.3279636855855226 0.6687456570675097 0.06382378865996918; 0.21115130903658963 0.9780140661279868 0.8594290820866141 0.4375421724241839; 0.9684974797648702 0.9284387257737836 0.9162628834915212 0.6834579476131102; 0.3602053746282866 0.9567068093319756 0.2067134684642914 0.44682561696966117]
New Matrix:
[1 0 1 0; 0 1 1 0; 1 1 1 1; 0 1 0 0]

Exercise 11.a.iii¶

Consider a Binomial distribution $B(n=10, p =0.3)$. Using pdf and cdf, compute the probability of exactly 4 successes and the probability of at most 4 successes. Lastly, set a loop to calculate and print the probabilities of getting $0, 1, 2, …, 10$ successes $$P(X = k), \quad k = 0,1,2,\dots,10.$$

In [3]:
using Distributions

dbinom = Binomial(10, 0.3)

pOf4 = pdf(dbinom, 4)
pLess4 = cdf(dbinom, 4)

println("P(X=4)=", pOf4)
println("P(X<4)=", pLess4 )

println("P(X=k) for k = 0:10")

for k in 0:10
    println("k = ", k, " - ", pdf(dbinom, k))
end
P(X=4)=0.20012094900000013
P(X<4)=0.8497316674000002
P(X=k) for k = 0:10
k = 0 - 0.028247524900000005
k = 1 - 0.12106082100000018
k = 2 - 0.23347444049999988
k = 3 - 0.2668279320000006
k = 4 - 0.20012094900000013
k = 5 - 0.10291934520000003
k = 6 - 0.03675690899999997
k = 7 - 0.009001692000000018
k = 8 - 0.0014467004999999982
k = 9 - 0.00013778100000000015
k = 10 - 5.904899999999995e-6

Exercise 11.a.vi¶

You are given a bivariate normal with mean vector $v\mu = (0,0)$ and $ m\Sigma = \begin{bmatrix}1 & 0.5 \\ 0.5 & 1 \end{bmatrix} $. Fix a Random.seed so your results are reproducible. Definie the distribution and then draw $5$ random vectors, storing them in a matrix where each row is one draw. Using this sample, compute the $2.5\%$ and $97.5\%$ quantiles for each column.

In [4]:
# Solution:

using Distributions, Statistics
Random.seed!(123)

vμ = [0.0, 0.0]
mΣ = [1.0 0.5; 0.5 1.0]
dnorm = MvNormal(vμ, mΣ)

mnorm = rand(dnorm, 5)'
mnorm 

q1 = quantile(mnorm[:,1], [0.025, 0.975])
q2 = quantile(mnorm[:,2], [0.025, 0.975])

println("Quantiles for column 1 = ", q1)
println("Quantiles for column 2 = ", q2)
Quantiles for column 1 = [-1.102994343489995, 0.7562179336808559]
Quantiles for column 2 = [-1.337718649125717, 0.31507687210983903]

Exercise 11.a.v¶

A lottery has 6 prizes $(3, 6, 9, 12, 15, 18)$ with probabilities $(0.1, 0.05, 0.3, 0.2, 0.1, 0.25)$. Use Categorical() to simulate one random prize.

In [6]:
# Solution:

using Distributions
using StatsBase

vProbs = [0.1, 0.05, 0.3, 0.2, 0.1, 0.25]
vValues = [3, 6, 9, 12, 15, 18]

prize = vValues[rand(Categorical(vProbs))]
println("Random prize = ", prize)
Random prize = 9

12 Parallelization¶

Exercise 12.a.i¶

Given the function $f(k) = \sum_{i=1}^{50000} \frac{1}{i+k}$. Define the function as $fMyFun(k)$ that returns their sum. For $M = 1000$, first create a SharedArray and fill it using a @distributed loop. Next craete a normal array and fill it using a regular for loop. Now comparing the runtimes of the two versions, what do you observe?

In [7]:
# Solution:

using Distributed
addprocs(4)

@everywhere using SharedArrays 

@everywhere function fMyFun(k)
   return sum(i -> 1/(i + k), 1:50000)
end

M = 1000

@time begin
    vParallel = SharedArray{Float64}(M)
    @distributed for k in 1:M
        vParallel[k] = fMyFun(k)
    end
end

@time begin
    vNormal = zeros(Float64, M)
    for k in 1:M
        vNormal[k] = fMyFun(k)
    end
end
  2.665802 seconds (1.28 M allocations: 87.590 MiB, 1.37% gc time, 248.62% compilation time: 3% of which was recompilation)
  0.089447 seconds (35.07 k allocations: 2.312 MiB, 59.31% compilation time)

Exercise 12.a.ii¶

Simulate incomes for $100,000$ people, where each income is drawn from a $LogNormal(10, 0.5)$ distribution. Then use a multicore loop to draw the samples and fill in a SharedArray. Finally, compute and compare the average income from your simulation with the theoretical mean. Hint: The theoretical mean of a $LogN(\mu,\sigma)$ is $e^{\mu + \tfrac{\sigma^2}{2}}$

In [8]:
# Solution:

using Distributed
addprocs(4)

@everywhere using Distributions
@everywhere using SharedArrays

M = 100000
vInc = SharedArray{Float64}(M)

@distributed for i in 1:M
    dlnorm = LogNormal(10, 0.5)
    vInc[i] = rand(dlnorm)
end

theoryMean = exp(10 + 0.5^2 / 2)

println("Simulated mean = ", round(mean(vInc), digits=2))
println("Theoretical mean = ", round(theoryMean, digits=2))
Simulated mean = 0.0
Theoretical mean = 24959.26

13 Plotting¶

Exercise 13.a.i¶

Imagine an economy where GDP grows at a constant rate of $3\%$ per period i.e. $GDP_t=100×(1+0.03)^t$ for $t=1,2,…,30$. Use a loop to generate the GDP values for $30$ periods and then plot the GDP path.

In [9]:
# Solution:

using Plots

time = 30
vgdp = Float64[]
for t in 0:time
    push!(vgdp, 100 * (1 + 0.03)^t)
end

plot(0:time, vgdp, line = (:red, 1, 2, :dash), xlabel = "Time", ylabel = "GDP", title = "Plot of \$ GDP_t = 100(1+0.03)^t \$", legend = false)
Out[9]:

Exercise 13.a.ii¶

Now suppose the GDP values from Exercise 1 correspond to monthly observations starting from 1 January 2021. Create vectors for the corresponding years and months (e.g. 2021 M1, 2021 M2, …).

Hint: you can build the labels by looping over months and resetting to M1 after M12. Lasltly, plot the GDP path against these monthly labels instead of $t$ periods.

In [10]:
# Solution:

using Plots

T = 31
time = 1:T

vyears  = Int[]
vmonths = Int[]
y, m = 2021, 1

for i in 1:T
    push!(vyears, y)
    push!(vmonths, m)
    m += 1
    if m > 12
        m = 1
        y += 1
    end
end 

labels   = ["$(vyears[i]) M$(vmonths[i])" for i in 1:T]
myTicks = 1:4:T                        

plot(time, vgdp; line = (:red, 1, 2, :dash), xlabel = "Month", ylabel = "GDP", title  = "Plot of \$ GDP_t = 100(1+0.03)^t \$", xticks = (myTicks, labels[myTicks]), legend = false)
Out[10]:

Exercise 13.b.i¶

You want to compare different GDP paths. Start by writing a function $fMyEmptyGdp()$ which creates an empty plot with axis labels, title, etc. Use a loop to plot several GDP line paths (e.g.growth rates of $2\%$, $3\%$, and $5\%$) on the same figure.

In [11]:
# Solution:

using Plots

fMyEmptyGdp() = plot([0,0],[NaN,NaN]; label="", grid=true, xlabel="Time", ylabel="GDP", title="Plot of \$ GDP_t = 100*(1+g)^t \$", xticks = (myTicks, labels[myTicks]))

T = 31
time = 1:T
vgrowthrates = [0.02, 0.03, 0.05]     
colors = [:blue, :red, :seagreen]


p = fMyEmptyGdp()
for (i, g) in enumerate(vgrowthrates)
    GDPcalc = 100 .* ((1 .+ g).^time )                
    plot!(p, time, GDPcalc; label="g = $(round(g*100; digits=1))%", line=(colors[i], 1, 2, :solid))
end
p
Out[11]:

Exercise 13.b.ii¶

Suppose a teacher collects data on $30$ students' weekly study hours and exam grades. Generate random data for study hours between 1 and 20. For exam grades assume they rise with study hours plus a little random variation (e.g. $examGrades = 40 + 2 \times studyHours + rand(-5,5,N)$). Make a scatter plot of exam grades (y-axis) against study hours (x-axis).

In [12]:
# Solution:

using Random, Plots

Random.seed!(123)
Nofstud = 30

vstudyHours = rand(1:20, Nofstud)                
vexamGrades = 40 .+ 2 .* vstudyHours .+ rand(-5:5, Nofstud)  

scatter(vstudyHours, vexamGrades; xlabel = "Study Hours", ylabel = "Exam Grades", title  = "Study Hours vs Exam Grades", legend = false,
        marker = :circle)
Out[12]:

Exercise 13.c.i¶

Simulate 100 heights (in cm) that follows a normal distribution ~ $N(170,10)$. Plot a histogram (as a density) and overlay a kernel density curve.

In [16]:
# Solution:

using Random, Distributions, Plots, StatsBase, StatsPlots

Random.seed!(123)
N = 100

dheights = rand(Normal(170, 10), N)

histogram(dheights; normalize=:pdf, bins=15, xlabel="Height (cm)", ylabel="Density", title="Distribution of Heights", alpha=0.5,label = "false")
density!(dheights; line=(2, :solid),label = "false")
Out[16]:

Exercise 13.c.ii¶

Simulate data for four countries over $t = 24$ periods using different random distributions (one distribution per country). Make one line plot for each country and combine them into a $2×2$ layout. (Hint: You can store the distributions and titles in vectors, and use a for loop to generate the data and plots.)

In [17]:
# Solution:

using Random, Distributions, Plots

Random.seed!(123)
T = 24
months = 1:T

plots = []  

for i in 1:4
    if i == 1
        data = rand(Normal(100, 5), T)
        title = "UK"
    elseif i == 2
        data = rand(Uniform(90, 110), T)
        title = "US"
    elseif i == 3
        data = 100 .+ rand(Exponential(1/100), T)
        title = "Switzerland"
    else
        data = rand(LogNormal(4.5, 0.1), T)
        title = "Germany"
    end

   p = plot(months, data; line=(2),  marker=:circle, title=title, legend=false)
   push!(plots, p)
end

plot(plots...; layout=(2,2),size=(900,600))
Out[17]:

Exercise 13.c.iii¶

A survey records the preferred study method for a small group of students: $$ \begin{array}{|c|c|c|c|} \hline \textbf{Gender} & \textbf{Group only} & \textbf{Individual only} & \textbf{Both} \\ \hline \text{Male} & 10 & 5 & 5 \\ \text{Female} & 6 & 8 & 6 \\ \hline \end{array} $$

Create a bar chart comparing male and female preferences side by side. Then make a pie chart showing the overall preference distribution (all students combined). Combine both plots in a $1×2$ layout.

In [18]:
# Solution:

using Plots

scategories = ["Group", "Individual", "Both"]


vmale   = [10, 5, 5]
vfemale = [6, 8, 6]

plot1 = groupedbar(scategories, [vmale vfemale], bar_position = :dodge,  xlabel = "Study Method",  ylabel = "Number of Students",  label = ["Male" "Female"],  
    title = "Preferences by Gender",legend = :topright )

totalstudents = vmale .+ vfemale
plot2 = pie(scategories, totalstudents, title = "Overall Preferences", colour = [:white, :black, :grey])

plot(plot1, plot2; layout = (1,2), size = (850, 400))
Out[18]:

14 Further¶

Exercise 14.a.i¶

Consider the matrix $A$ and vector $b$: $$ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 1 & 6 \\ 7 & 8 & 1 \end{bmatrix} \; , \quad b = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \; . $$ Compare the computational time needed to solve the system of equations $Ax = b$ by typing A\b and by typing inv(A)*b.

In [19]:
# Solution:

mA = [1 2 3; 4 1 6; 7 8 1]
vb = [1,2,2]

@time mA\vb 
@time inv(mA)*vb
  0.397631 seconds (366.38 k allocations: 24.565 MiB, 99.96% compilation time)
  0.205967 seconds (227.86 k allocations: 15.332 MiB, 5.74% gc time, 99.87% compilation time)
Out[19]:
3-element Vector{Float64}:
 0.14423076923076922
 0.09615384615384619
 0.22115384615384612

Exercise 14.a.ii¶

Let $f(x) = (x-3)^2+5$. Find the value of $x$ that minimizes this function.

In [20]:
# Solution:

using Optim

f(x) = (x - 3)^2 + 5

optResult = optimize(f, 0.0, 10.0, GoldenSection())

xmin = Optim.minimizer(optResult)
fmin = Optim.minimum(optResult)

println("Minimum at x = ", round(xmin))
println("Minimum value = ", round(fmin))
Minimum at x = 3.0
Minimum value = 5.0

Exercise 14.a.iii¶

Let $g(x) = ln(x^2 +1)$. Compute the derivative at $x=1$. Then define a function $fMyGradient(x)$ that returns the gradient at any x. Plot both $f(x)$ and $f'(x)$ over the range $x \in[-2,2]$.

In [21]:
# Solution:

using Zygote
using Plots

g(x) = log(x^2 + 1)

dg1 = gradient(g, 1.0)[1]
println("g'(1) = ", dg1)

fMyGradient(x) = gradient(g, x)[1]

x = -2:0.01:2
plot(x, g, label="g(x) = log(x² + 1)", linewidth=2)
plot!(x, fMyGradient, label="g'(x)", linewidth=2, linestyle=:dash)
g'(1) = 1.0
Out[21]:

Exercise 14.a.iv¶

You are given $x = (1, 2, 3, 4)$ and $y = (1 , 4, 9, 16)$. Use linear interpolation to approximate the value of the function at $x = 2.5$. Now use the same object to extrapolate and approximate the value at $x = 5$.

In [23]:
# Solution:

using Interpolations
using Plots

vx = [1, 2, 3, 4]
vy = [1, 4, 9, 16]

itp = LinearInterpolation(vx, vy; extrapolation_bc=Line())

interp = itp(2.5)
extrap = itp(5.0)

scatter(vx, vy; label="Data", legend=:topleft)

plot!(1:0.01:5, itp.(1:0.01:5); label="Interpolated Line", lw=2, xlabel = "x", ylabel = "f(x)", title = " \$ y = x^2 \$ ")
scatter!([2.5], [interp]; label="Interpolated x=2.5", color=:green, marker=:utriangle)
scatter!([5.0], [extrap]; label="Extrapolated x=5", color=:red, marker=:diamond)
WARNING: using Interpolations.gradient in module Main conflicts with an existing identifier.
Out[23]:

Exercise 14.a.v¶

You are given $f(x,y) = x^2 + 3xy$. Find the partial derivatives and then evaluate $f$ at $x = 2$ and $y = 1$.

In [27]:
# Solution:

using Symbolics

@variables x y
fxy = x^2 + 3x*y

Diffx = Differential(x)
Diffy = Differential(y)

dfdx = expand_derivatives(Diffx(fxy))
dfdy = expand_derivatives(Diffy(fxy))

println("∂f/∂x = ", dfdx)
println("∂f/∂y = ", dfdy)

fxyval = substitute(fxy, Dict(x => 2, y => 1))
println("f(2, 1) = ", fxyval)
∂f/∂x = 2x + 3y
∂f/∂y = 3x
f(2, 1) = 10