class: center, middle

# Julia Tutorial @ Home Office

## Part 1: Language Overview

Michael Kraus

NMPP Seminar

23.04.2020, 30.04.2020, 07.05.2020

---

layout: false

.toc[

.three-column-one[

.highlighted[
#### Part 1: Language Overview

* Introduction
* Variables
* Numbers
* Data Structures
* Functions
* Control Flow
* Types
* Methods
* Constructors
* Modules
* Scope of Variables
* Package Management
* Performance Tips
* Style Guide
]

#### Part 2: Advanced Julia

* Parametric Types
* Parametric Methods
* Conversion and Promotion
* Interfaces
* Meta Programming

]

.three-column-two[

#### Part 3: Introspection

* Benchmarking
* Code Introspection
* More Performance Tips
* Profiling
* Debugging
* Stack Traces

#### Part 4: Design Patterns

* Composition
* Multiple Dispatch
* Traits
* Method Design
* Maintainability Patterns
* Robustness Patterns
* Anti-Patterns

]

.three-column-three[

#### Part 5: Package Development

* Create a Package
* Tests
* Documentation
* GitHub Actions
* Travis CI
* Code Coverage

#### Part 6: Parallel Programming

* Tasks
* Threads
* Distributed
* SharedArrays
* DistributedArrays
* MPI

#### Part 7: Useful Packages

* Plotting
* Numerics
* Literate Programming
* Automatic Differentiation
* Language Interoperability

]

]

---

class: center, middle

# Introduction

---

class: largelist

.left-column[
## Introduction
]

.right-column[

* **high-level**, **high-performance** and **high-productivity**
* facilitates development of general, modular, extendable and thus **reusable** code
* encourages good software development practices: built-in tools for documentation, tests, version control
* high-level syntax, just-in-time compilation, **multiple dispatch** and **abstraction layers** facilitate and simplify
  * targeted and incremental manual optimisations
  * advanced automatic optimisations by the compiler
  * programming of heterogeneous architectures
  * development of domain-specific languages
  * sophisticated techniques like automatic differentiation and differentiable
programming

]

---

.left-column[
## Introduction
### Run Julia
]

.right-column[

* full-featured interactive command-line REPL (read-eval-print loop)

```
$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.4.1 (2020-04-14)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia>
```

* scripts

```bash
$ julia myscript.jl
```

* run locally: download from [julialang.org](https://julialang.org/downloads/)

* run on clusters (MPCDF & TOK):

```bash
$ module load julia/1.3.1
```

* [MPCDF Jupyter Notebook Service](https://rvs.mpcdf.mpg.de/) & [GWDG Jupyter Cloud](http://jupyter-cloud.gwdg.de) (see [wiki](https://wiki.mpcdf.mpg.de/NMPP/index.php/Julia) for details)

]

---

.left-column[
## Introduction
### Run Julia
### Write Code
]

.right-column[

* [Atom + Juno](https://junolab.org)
* [Visual Studio Code](https://www.julia-vscode.org)
* [Emacs](https://github.com/JuliaEditorSupport/julia-emacs)
* [Vim](https://github.com/JuliaEditorSupport/julia-vim)
* [Jupyter + IJulia](https://github.com/JuliaLang/IJulia.jl)

]

---

.left-column[
## Introduction
### Run Julia
### Write Code
### Miscellaneous
]

.right-column[

* `println` writes the text representation of a value to `stdout`, followed by a newline

```julia
julia> println("Hello World!")
Hello World!
```

* functions are called using the traditional parenthesis syntax

```julia
julia> cos(0.0)
1.0
```

* the general syntax for indexing uses square brackets

```julia
julia> x[1] = 1
```

* macros are invoked with the syntax `@name expr1 expr2 ...`

```julia
julia> @assert 1 == 0
Error: AssertionError: 1 == 0
```

* Julia features interned strings referred to as symbols

```julia
julia> :foo
:foo

julia> :foo == Symbol("foo")
true

julia> Symbol("func", 10)
:func10
```

]

---

.left-column[
## Introduction
### Run Julia
### Write Code
### Miscellaneous
### Unicode
]

.right-column[

* full Unicode support

* strings

```julia
julia> println("Greetings. السلام عليكم. こんにちは。")
Greetings. السلام عليكم. こんにちは。
```

* variable, function, etc. names

```julia
julia> 🐢 = "turtle"
"turtle"

julia> println(🐢)
turtle
```

* syntax: LaTeX-like backslash notation with tab completion

```
julia> x\hat[TAB]\_1[TAB] = 2\pi[TAB]
``` results in ```julia julia> x̂₁ = 2π 6.283185307179586 ``` ] --- .left-column[ ## Introduction ### Run Julia ### Write Code ### Miscellaneous ### Unicode ] .right-column[ * some infix operators and elementary functions | short | long | name | |:----- |:------- |:------------------------ | | `≠` | `!=` | inequality | | `≤` | `<=` | less than or equal to | | `≥` | `>=` | greater than or equal to | | `∈` | `in` | element of | | `∉` | `notin` | not in | | `÷` | `div` | truncated division | | `√` | `sqrt` | square root | | `∛` | `cbrt` | cubic root | * some more .small[ ~ ← → ↔ ↚ ↛ ↠ ↣ ↦ ↮ ⇎ ⇏ ⇒ ⇔ ⇴ ⇶ ⇷ ⇸ ⇹ ⇺ ⇻ ⇼ ⇽ ⇾ ⇿ ⟵ ⟶ ⟷ ⟷ ⟹ ⟺ ⟻ ⟼ ⟽ ⟾ ⟿ ⤀ ⤁ ⤂ ⤃ ⤄ ⤅ ⤆ ⤇ ⤌ ⤍ ⤎ ⤏ ⤐ ⤑ ⤔ ⤕ ⤖ ⤗ ⤘ ⤝ ⤞ ⤟ ⤠ ⥄ ⥅ ⥆ ⥇ ⥈ ⥊ ⥋ ⥎ ⥐ ⥒ ⥓ ⥖ ⥗ ⥚ ⥛ ⥞ ⥟ ⥢ ⥤ ⥦ ⥧ ⥨ ⥩ ⥪ ⥫ ⥬ ⥭ ⥰ ⧴ ⬱ ⬰ ⬲ ⬳ ⬴ ⬵ ⬶ ⬷ ⬸ ⬹ ⬺ ⬻ ⬼ ⬽ ⬾ ⬿ ⭀ ⭁ ⭂ ⭃ ⭄ ⭇ ⭈ ⭉ ⭊ ⭋ ⭌ ← → ≥ ≤ ≡ ≠ ≢ ∈ ∉ ∋ ∌ ⊆ ⊈ ⊂ ⊄ ⊊ ∝ ∊ ∍ ∥ ∦ ∷ ∺ ∻ ∽ ∾ ≁ ≃ ≄ ≅ ≆ ≇ ≈ ≉ ≊ ≋ ≌ ≍ ≎ ≐ ≑ ≒ ≓ ≔ ≕ ≖ ≗ ≘ ≙ ≚ ≛ ≜ ≝ ≞ ≟ ≣ ≦ ≧ ≨ ≩ ≪ ≫ ≬ ≭ ≮ ≯ ≰ ≱ ≲ ≳ ≴ ≵ ≶ ≷ ≸ ≹ ≺ ≻ ≼ ≽ ≾ ≿ ⊀ ⊁ ⊃ ⊅ ⊇ ⊉ ⊋ ⊏ ⊐ ⊑ ⊒ ⊜ ⊩ ⊬ ⊮ ⊰ ⊱ ⊲ ⊳ ⊴ ⊵ ⊶ ⊷ ⋍ ⋐ ⋑ ⋕ ⋖ ⋗ ⋘ ⋙ ⋚ ⋛ ⋜ ⋝ ⋞ ⋟ ⋠ ⋡ ⋢ ⋣ ⋤ ⋥ ⋦ ⋧ ⋨ ⋩ ⋪ ⋫ ⋬ ⋭ ⋲ ⋳ ⋴ ⋵ ⋶ ⋷ ⋸ ⋹ ⋺ ⋻ ⋼ ⋽ ⋾ ⋿ ⟈ ⟉ ⟒ ⦷ ⧀ ⧁ ⧡ ⧣ ⧤ ⧥ ⩦ ⩧ ⩪ ⩫ ⩬ ⩭ ⩮ ⩯ ⩰ ⩱ ⩲ ⩳ ⩴ ⩵ ⩶ ⩷ ⩸ ⩹ ⩺ ⩻ ⩼ ⩽ ⩾ ⩿ ⪀ ⪁ ⪂ ⪃ ⪄ ⪅ ⪆ ⪇ ⪈ ⪉ ⪊ ⪋ ⪌ ⪍ ⪎ ⪏ ⪐ ⪑ ⪒ ⪓ ⪔ ⪕ ⪖ ⪗ ⪘ ⪙ ⪚ ⪛ ⪜ ⪝ ⪞ ⪟ ⪠ ⪡ ⪢ ⪣ ⪤ ⪥ ⪦ ⪧ ⪨ ⪩ ⪪ ⪫ ⪬ ⪭ ⪮ ⪯ ⪰ ⪱ ⪲ ⪳ ⪴ ⪵ ⪶ ⪷ ⪸ ⪹ ⪺ ⪻ ⪼ ⪽ ⪾ ⪿ ⫀ ⫁ ⫂ ⫃ ⫄ ⫅ ⫆ ⫇ ⫈ ⫉ ⫊ ⫋ ⫌ ⫍ ⫎ ⫏ ⫐ ⫑ ⫒ ⫓ ⫔ ⫕ ⫖ ⫗ ⫘ ⫙ ⫷ ⫸ ⫹ ⫺ ⊢ ⊣ + - ⊕ ⊖ ⊞ ⊟ ∪ ∨ ⊔ ± ∓ ∔ ∸ ≂ ≏ ⊎ ⊻ ⊽ ⋎ ⋓ ⧺ ⧻ ⨈ ⨢ ⨣ ⨤ ⨥ ⨦ ⨧ ⨨ ⨩ ⨪ ⨫ ⨬ ⨭ ⨮ ⨹ ⨺ ⩁ ⩂ ⩅ ⩊ ⩌ ⩏ ⩐ ⩒ ⩔ ⩖ ⩗ ⩛ ⩝ ⩡ ⩢ ⩣ ∩ ∧ ⊗ ⊘ ⊙ ⊚ ⊛ ⊠ ⊡ ⊓ ∗ ∙ ∤ ⅋ ≀ ⊼ ⋄ ⋆ ⋇ ⋉ ⋊ ⋋ ⋌ ⋏ ⋒ ⟑ ⦸ ⦼ ⦾ ⦿ ⧶ ⧷ ⨇ ⨰ ⨱ ⨲ ⨳ ⨴ ⨵ ⨶ ⨷ ⨸ ⨻ ⨼ ⨽ ⩀ ⩃ ⩄ ⩋ ⩍ ⩎ ⩑ ⩓ ⩕ ⩘ ⩚ ⩜ ⩞ ⩟ ⩠ ⫛ ⊍ ▷ ⨝ ⟕ ⟖ ⟗ ↑ ↓ ⇵ ⟰ ⟱ ⤈ ⤉ ⤊ ⤋ ⤒ ⤓ ⥉ ⥌ ⥍ ⥏ ⥑ ⥔ ⥕ ⥘ ⥙ ⥜ ⥝ ⥠ ⥡ ⥣ ⥥ ⥮ ⥯ ↑ ↓ ] ] --- class: center, middle # Variables --- .left-column[ ## Variables ] .right-column[ * assign the value 10 to the variable `x` ```julia julia> x = 10 10 ``` * declare multiple variables at once ```julia julia> y, z = 7, 11 (7, 11) ``` * doing math with `x`'s and `y`'s values and reassign to `x` ```julia julia> x = x + y 17 ``` * 
assign a value of another type, like a float

```julia
julia> x = 1.0
1.0
```

* variable names are case-sensitive, have no semantic meaning, and support Unicode
  * must begin with a letter (A-Z or a-z), underscore or certain Unicode code points
  * subsequent characters may also include `!`, digits 0-9 and other Unicode code points

* most Unicode operator symbols, such as `⊗`, are parsed as infix operators and are available for user-defined methods (e.g. `⊗ = kron` defines `⊗` as an infix Kronecker product)

]

---

.left-column[
## Numbers
### Ints & Floats
]

.right-column[

* Julia provides a broad range of primitive numeric types

* integer types:

| Type      | Signed? | Number of bits | Smallest value | Largest value |
|:--------- |:------- |:-------------- |:-------------- |:------------- |
| `Int8`    | ✓       | 8              | -2^7           | 2^7 - 1       |
| `UInt8`   |         | 8              | 0              | 2^8 - 1       |
| `Int16`   | ✓       | 16             | -2^15          | 2^15 - 1      |
| `UInt16`  |         | 16             | 0              | 2^16 - 1      |
| `Int32`   | ✓       | 32             | -2^31          | 2^31 - 1      |
| `UInt32`  |         | 32             | 0              | 2^32 - 1      |
| `Int64`   | ✓       | 64             | -2^63          | 2^63 - 1      |
| `UInt64`  |         | 64             | 0              | 2^64 - 1      |
| `Int128`  | ✓       | 128            | -2^127         | 2^127 - 1     |
| `UInt128` |         | 128            | 0              | 2^128 - 1     |
| `Bool`    | N/A     | 8              | `false` (0)    | `true` (1)    |

* floating-point types:

| Type      | Precision                                                                      | Number of bits |
|:--------- |:------------------------------------------------------------------------------ |:-------------- |
| `Float16` | [half](https://en.wikipedia.org/wiki/Half-precision_floating-point_format)     | 16             |
| `Float32` | [single](https://en.wikipedia.org/wiki/Single_precision_floating-point_format) | 32             |
| `Float64` | [double](https://en.wikipedia.org/wiki/Double_precision_floating-point_format) | 64             |

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
]

.right-column[

* the default type for an integer literal depends on the architecture of the target system

```
# 32-bit system
julia> typeof(1)
Int32

# 64-bit system
julia> typeof(1)
Int64
```

* aliases `Int` and `UInt` refer to the system's signed and unsigned native
integer types

```
# 32-bit system
julia> Int
Int32

julia> UInt
UInt32

# 64-bit system
julia> Int
Int64

julia> UInt
UInt64
```

* large integer literals that cannot be represented using only 32 bits but can be represented in 64 bits always create 64-bit integers, regardless of the system type

```
# 32-bit or 64-bit system:
julia> typeof(3000000000)
Int64
```

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
]

.right-column[

* the minimum and maximum representable values of primitive numeric types such as integers are given by the `typemin` and `typemax` functions

```julia
julia> (typemin(Int32), typemax(Int32))
(-2147483648, 2147483647)
```

* the values returned by `typemin` and `typemax` are always of the given argument type

* exceeding the maximum representable value of a given type results in wraparound behaviour (reflecting the characteristics of the underlying integer arithmetic as implemented on modern computers)

```julia
julia> x = typemax(Int64)
9223372036854775807

julia> x + 1
-9223372036854775808

julia> x + 1 == typemin(Int64)
true
```

* integer division (the `div` function or `÷`) throws a `DivideError` when dividing by zero and when dividing the lowest negative number (`typemin`) by -1

* the remainder and modulus functions (`rem` and `mod`) throw a `DivideError` when their second argument is zero

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
]

.right-column[

* floating-point numbers are represented in the standard formats

```julia
julia> 1.0
1.0

julia> 1.
1.0

julia> 0.5
0.5

julia> .5
0.5

julia> 1e10
1.0e10

julia> -2.5e-4
-0.00025
```

* these are all `Float64` values; `Float32` values are entered by writing an `f` in place of `e`

```julia
julia> 0.5f0
0.5f0

julia> 2.5f-4
0.00025f0
```

* note that, unlike `Int`, there is no `Float` type alias whose size depends on the machine architecture: the size of `Int` reflects the size of a native pointer on that machine, whereas the sizes of the floating-point types are specified by the IEEE 754 standard

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
]

.right-column[

* floating-point numbers have two zeros: positive zero and negative zero

* these are equal to each other but have different binary representations, as can be seen using the `bitstring` function

```julia
julia> 0.0 == -0.0
true

julia> bitstring(0.0)
"0000000000000000000000000000000000000000000000000000000000000000"

julia> bitstring(-0.0)
"1000000000000000000000000000000000000000000000000000000000000000"
```

* there are three specified standard floating-point values that do not correspond to any point on the real number line

| `Float16` | `Float32` | `Float64` | Name              | Description                                                     |
|:--------- |:--------- |:--------- |:----------------- |:--------------------------------------------------------------- |
| `Inf16`   | `Inf32`   | `Inf`     | positive infinity | a value greater than all finite floating-point values           |
| `-Inf16`  | `-Inf32`  | `-Inf`    | negative infinity | a value less than all finite floating-point values              |
| `NaN16`   | `NaN32`   | `NaN`     | not a number      | a value not `==` to any floating-point value (including itself) |

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
]

.right-column[

* these floating-point values are the results of certain arithmetic operations (1/2)

```julia
julia> 1/0
Inf

julia> -5/0
-Inf

julia> 0.000001/0
Inf

julia> 0/0
NaN

julia> 500 + Inf
Inf

julia> 500 - Inf
-Inf

julia> Inf + Inf
Inf

julia> Inf - Inf
NaN
```

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
]

.right-column[

* these floating-point values are the results of certain arithmetic operations (2/2)

```julia
julia> Inf * Inf
Inf

julia> Inf / Inf
NaN

julia> 0 * Inf
NaN

julia> 1 / Inf
0.0
```

* the `typemin` and `typemax` functions also apply to floating-point types

```julia
julia> (typemin(Float16),typemax(Float16))
(-Inf16, Inf16)

julia> (typemin(Float32),typemax(Float32))
(-Inf32, Inf32)

julia> (typemin(Float64),typemax(Float64))
(-Inf, Inf)
```

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
#### Juxtaposition
]

.right-column[

* to make common numeric formulae and expressions clearer, Julia allows variables to be immediately preceded by a numeric literal, implying multiplication

* this makes writing polynomial expressions much cleaner

```julia
julia> x = 3
3

julia> 2x^2 - 3x + 1
10

julia> 1.5x^2 - .5x + 1
13.0
```

* it also makes writing exponential functions more elegant

```julia
julia> 2^2x
64
```

* numeric literals also work as coefficients to parenthesized expressions

```julia
julia> 2(x-1)^2 - 3(x-1) + 1
3
```

* note that no whitespace may come between a numeric literal coefficient and the identifier or parenthesized expression which it multiplies

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
#### Juxtaposition
]

.right-column[

* parenthesized expressions can be used as coefficients to variables

```julia
julia> (x-1)x
6
```

* juxtaposition of two parenthesized expressions or placing a variable before a parenthesized expression cannot be used to imply multiplication

```julia
julia> (x-1)(x+1)
Error: MethodError: objects of type Int64 are not callable

julia> x(x+1)
Error: MethodError: objects of type Int64 are not callable
```

* both expressions are interpreted as function application

* any expression that is not a numeric literal, when immediately
followed by a parenthetical, is interpreted as a function applied to the values in parentheses

* juxtaposed literal coefficient syntax may conflict with engineering notation for floating-point literals
  * the floating-point literal expression `1e10` could be interpreted as the numeric literal `1` multiplied by the variable `e10`
  * the 32-bit floating-point literal expression `1.5f22` could be interpreted as the numeric literal `1.5` multiplied by the variable `f22`

* in all cases the ambiguity is resolved in favor of interpretation as numeric literals

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
#### Juxtaposition
#### eps
]

.right-column[

* most real numbers cannot be represented exactly with floating-point (fp) numbers

* machine epsilon: distance between two adjacent representable fp numbers

* `eps` gives the distance between `1.0` and the next larger representable fp value

```julia
julia> eps(Float32) # 2.0^-23
1.1920929f-7

julia> eps(Float64) # 2.0^-52
2.220446049250313e-16

julia> eps() # same as eps(Float64)
2.220446049250313e-16
```

* `eps` can also take a floating-point value as an argument, and gives the absolute difference between that value and the next representable floating-point value

```julia
julia> eps(1.0)
2.220446049250313e-16

julia> eps(1E3)
1.1368683772161603e-13

julia> eps(1E-27)
1.793662034335766e-43

julia> eps(0.0)
5.0e-324
```

]

---

.left-column[
## Numbers
### Ints & Floats
#### Integers
#### Floats
#### Inf & NaN
#### Juxtaposition
#### eps
#### zero & one
]

.right-column[

* `zero` and `one` return literal `0` and `1` corresponding to a specified type or the type of a given variable

| Function  | Description                                      |
|:--------- |:------------------------------------------------ |
| `zero(x)` | Literal zero of type `x` or type of variable `x` |
| `one(x)`  | Literal one of type `x` or type of variable `x`  |

* use them to avoid overhead from unnecessary type conversion

* examples

```julia
julia> zero(Float32)
0.0f0
julia> one(Int32)
1

julia> zero(1.0)
0.0

julia> one(0)
1
```

]

---

.left-column[
## Numbers
### Ints & Floats
### Arbitrary Precision
]

.right-column[

* software support for arbitrary-precision arithmetic handles operations on numeric values that cannot be represented effectively in native hardware representations

* Julia wraps the GMP library to allow computations with arbitrary-precision integers and the GNU MPFR library for arbitrary-precision floating-point numbers

* `BigInt` and `BigFloat` hold arbitrary-precision integer and floating-point numbers

* constructors exist to create these types from primitive numerical types

```julia
julia> BigInt(typemax(Int64)) + 1
9223372036854775808

julia> BigFloat(2.0^66) / 3
2.459565876494606882133333333333333333333333333333333333333333333333333333333344e+19
```

* the `big` string macro (`big"..."`) or `parse` can be used to convert strings

```julia
julia> big"123456789012345678901234567890" + 1
123456789012345678901234567891

julia> parse(BigInt, "123456789012345678901234567890") + 1
123456789012345678901234567891

julia> big"1.23456789012345678901"
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> parse(BigFloat, "1.23456789012345678901")
1.234567890123456789010000000000000000000000000000000000000000000000000000000004
```

]

---

.left-column[
## Numbers
### Ints & Floats
### Arbitrary Precision
#### DecFP
]

.right-column[

* *DecFP.jl*: wrapper around the Intel Decimal Floating-Point Math Library, providing a software implementation of the IEEE 754-2008 Decimal Floating-Point Arithmetic specification

* provides 32-bit, 64-bit, and 128-bit decimal floating-point types `Dec32`, `Dec64`, and `Dec128`, respectively

* much faster than arbitrary-precision floats (though still about 100x slower than the hardware binary floating-point types `Float32` and `Float64`)

* DecFP types can be created from primitive numerical types or strings

```julia
using DecFP                 # install first with: import Pkg; Pkg.add("DecFP")

a = Dec64(3)                # from an integer
b = Dec64(3.5)              # from a binary floating-point number
c = parse(Dec64, "3.2")     # from a string
d = d"3.2"                  # string macro, constructs a Dec64
s = sqrt(d128"2")           # string macro, constructs a Dec128
```

* the string macro `d"3.2"` constructs a `Dec64`, while `d128"2"` constructs a `Dec128`

]

---

.left-column[
## Numbers
### Ints & Floats
### Arbitrary Precision
### Complex Numbers
]

.right-column[

* Julia includes predefined types for complex numbers, and supports all the standard mathematical operations and elementary functions on them

* the global constant `im` is bound to the complex number $i$, representing the principal square root of $-1$

* the juxtaposition of numeric literals with identifiers as coefficients provides convenient syntax for complex numbers, similar to the mathematical notation

```julia
julia> 1+2im
1 + 2im
```

* you can perform all the standard arithmetic operations with complex numbers

```julia
julia> (1 + 2im) + (1 - 2im)
2 + 0im

julia> (1 + 2im) * (2 - 3im)
8 + 1im
```

* standard functions to manipulate complex values are provided

```julia
julia> real(1 + 2im)
1

julia> imag(1 + 2im)
2

julia> abs(1 + 2im)
2.23606797749979
```

]

---

.left-column[
## Numbers
### Ints & Floats
### Arbitrary Precision
### Complex Numbers
### Rational Numbers
]

.right-column[

* Julia has a rational number type to represent exact ratios of integers

* rationals are constructed using the `//` operator

```julia
julia> 2//3
2//3
```

* if the numerator and denominator of a rational have common factors, they are reduced to lowest terms such that the denominator is non-negative

```julia
julia> 6//9
2//3

julia> 5//-15
-1//3
```

* standard functions to extract the numerator and denominator of a rational

```julia
julia> numerator(2//3)
2

julia> denominator(2//3)
3
```

* rationals can easily be converted to floating-point numbers

```julia
julia> float(3//4)
0.75
```

]

---

class: center, middle

# Data Structures

---

.left-column[
## Data Structures
### Tuples
]

.right-column[

* a tuple is a fixed-length container that can hold any values, but cannot be modified

* tuples are constructed with commas and parentheses, and accessed via indexing

```julia
julia> x = (0.0, "hello", 6*7)
(0.0, "hello", 42)

julia> x[2]
"hello"
```

* a length-1 tuple must be written with a comma

```julia
julia> (1,)
(1,)
```

since `(1)` would just be a parenthesized value

```julia
julia> (1)
1
```

* `()` represents the empty (length-0) tuple

```julia
julia> ()
()
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
]

.right-column[

* named tuples are tuples whose components are named

```julia
julia> x = (p=1, q=1+1, r=2π)
(p = 1, q = 2, r = 6.283185307179586)
```

* fields of named tuples can be accessed by name using dot syntax

```julia
julia> x.p
1
```

* tuples as well as named tuples can be unpacked into separate variables

```julia
julia> x1, x2, x3 = x
(p = 1, q = 2, r = 6.283185307179586)

julia> x1
1
```

* *Parameters.jl* provides the `@unpack` macro for selectively unpacking fields of a named tuple; fields that are not listed (here `q`) are not defined as variables

```julia
julia> using Parameters

julia> @unpack p, r = x;

julia> p
1

julia> q
Error: UndefVarError: q not defined

julia> r
6.283185307179586
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
]

.right-column[

* dictionaries are a mapping between a collection of keys and a collection of values, where each key is associated with a single value

* calling `Dict()` without arguments creates a new dictionary with no items

```julia
julia> mydict = Dict()
Dict{Any,Any} with 0 entries
```

* items can be added to a dictionary and accessed using index notation

```julia
julia> mydict["one"] = 1
1
```

```julia
julia> mydict["one"]
1
```

* dictionaries can be initialised with key-value pairs using the arrow syntax `=>`

```julia
julia> mydict = Dict("one" => 1, "two" => 2, "three" => 3)
Dict{String,Int64} with 3 entries:
  "two"   => 2
  "one"   => 1
  "three" => 3
```

* the `keys` and `values` functions return collections of all keys and values

```julia
julia> "one" ∈ keys(mydict)
true
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
]

.right-column[

* an array is a collection of objects stored in a multi-dimensional grid

* Julia provides a first-class array implementation, that is implemented almost completely in Julia itself, and derives its performance from the
compiler

* in the most general case, an array may contain objects of type `Any`, but for achieving good performance, arrays should contain objects of a specific type, such as `Float64`

* Julia does not expect programs to be written in a vectorized style for performance

* Julia's compiler uses type inference and generates optimized code for scalar array indexing, allowing programs to be written in a style that is convenient and readable, without sacrificing performance

* in Julia, all arguments to functions are passed by sharing (i.e. by pointers)

* arrays can be created by enclosing the elements in square brackets

```julia
julia> [10, 20, 30]
3-element Array{Int64,1}:
 10
 20
 30

julia> ["spam", 2.0, 5, [10, 20]]
4-element Array{Any,1}:
  "spam"
 2.0
 5
  [10, 20]
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
]

.right-column[

* many functions for constructing and initializing arrays are provided

.small[

| Function                            | Description                                                             |
|:----------------------------------- |:----------------------------------------------------------------------- |
| `zeros(T, dims...)`                 | an `Array` of all zeros                                                 |
| `ones(T, dims...)`                  | an `Array` of all ones                                                  |
| `trues(dims...)`                    | a `BitArray` with all values `true`                                     |
| `falses(dims...)`                   | a `BitArray` with all values `false`                                    |
| `reshape(A, dims...)`               | an array containing the same data as `A`, but with different dimensions |
| `copy(A)`                           | copy `A`                                                                |
| `deepcopy(A)`                       | copy `A`, recursively copying its elements                              |
| `reinterpret(T, A)`                 | an array with the same binary data as `A`, but element type `T`         |
| `randn(T, dims...)`                 | an `Array` with random, standard normally distributed values            |
| `rand(T, dims...)`                  | an `Array` with random, uniformly distributed values                    |
| `Matrix{T}(I, m, n)`                | `m`-by-`n` identity matrix (requires `using LinearAlgebra`)             |
| `range(start, stop=stop, length=n)` | range of `n` linearly spaced elements from `start` to `stop`            |
| `fill!(A, x)`                       | fill the array `A` with the value `x`                                   |
| `fill(x, dims...)`                  | an `Array` filled with the value `x`                                    |

]

* calls with a `dims...` argument can either take a single tuple of dimension sizes or a series of dimension sizes passed as a variable number of arguments

```julia
julia> zeros(Int8, 2, 3) # equivalent to zeros(Int8, (2, 3))
2×3 Array{Int8,2}:
 0  0  0
 0  0  0
```

* most of these functions accept a first input `T`, which is the element type of the array (if omitted, `T` defaults to `Float64`)

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
]

.right-column[

* comprehensions provide a general and powerful way to construct arrays

```julia
julia> X = [ 1//2^i for i in 0:2 ]
3-element Array{Rational{Int64},1}:
 1//1
 1//2
 1//4

julia> Y = [ i*2^j for i in 1:3, j in 0:3]
3×4 Array{Int64,2}:
 1  2   4   8
 2  4   8  16
 3  6  12  24
```

* basic functions on arrays

| Function       | Description                                                      |
|:-------------- |:---------------------------------------------------------------- |
| `eltype(A)`    | the type of the elements contained in `A`                        |
| `length(A)`    | the number of elements in `A`                                    |
| `ndims(A)`     | the number of dimensions of `A`                                  |
| `size(A)`      | a tuple containing the dimensions of `A`                         |
| `size(A,n)`    | the size of `A` along dimension `n`                              |
| `axes(A)`      | a tuple containing the valid indices of `A`                      |
| `axes(A,n)`    | a range expressing the valid indices along dimension `n`         |
| `eachindex(A)` | an efficient iterator for visiting each position in `A`          |
| `stride(A,k)`  | linear index distance between adjacent elements in dimension `k` |
| `strides(A)`   | a tuple of the strides in each dimension                         |

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
]

.right-column[

* concatenation

| Syntax            | Function | Description                                        |
|:----------------- |:-------- |:-------------------------------------------------- |
|                   | `cat`    | concatenate input arrays along dimension(s) `k`    |
| `[A; B; C; ...]`  | `vcat`   | shorthand for `cat(A...; dims=1)`                  |
| `[A B C ...]`     | `hcat`   | shorthand for `cat(A...; dims=2)`                  |
| `[A B; C D; ...]` | `hvcat`  | simultaneous vertical and horizontal concatenation |

```julia
julia> [1:2; 4:5]
4-element Array{Int64,1}:
 1
 2
 4
 5

julia> [1:2, 4:5] # Has a comma, so no concatenation occurs
2-element Array{UnitRange{Int64},1}:
 1:2
 4:5

julia> [1:2 4:5 7:8]
2×3 Array{Int64,2}:
 1  4  7
 2  5  8

julia> [[1 2] [3]; [4 5] [6]; [7 8] [9]]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
]

.right-column[

* the general syntax for indexing an n-dimensional array `A` is `A[I_1, I_2, ..., I_n]` where each `I_k` may be
  * a scalar index:
      * an integer, or a `CartesianIndex{N}`, which behaves like an N-tuple of integers spanning multiple dimensions
      * `begin` and `end`, which represent the index of the first and last element in a dimension
  * an array of scalar indices:
      * vectors and multidimensional arrays of integers or `CartesianIndex{N}`
      * empty arrays like `[]`, which select no elements
      * ranges like `a:c` or `a:b:c`, which select contiguous or strided subsections from `a` to `c` (inclusive)
  * an object that represents an array of scalar indices and can be converted to such by `to_indices`:
      * `Colon()` or `(:)`, which represents all indices within an entire dimension or across the entire array
      * arrays of booleans, which select elements at their `true` indices

* cartesian indexing: the ordinary way to index into an `N`-dimensional array is to use exactly `N` indices, where each index selects the position(s) in its particular dimension

* linear indexing: when exactly one index `i` is provided, that index no longer represents a location in a particular dimension of the array, but it selects the `i`th element using the column-major iteration order that linearly spans
the entire array

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
]

.right-column[

* examples

```julia
julia> X = reshape(collect(1:2:18), (3, 3))
3×3 Array{Int64,2}:
 1   7  13
 3   9  15
 5  11  17

julia> X[2,2]
9

julia> X[4]
7

julia> X[[1 4; 3 8]]
2×2 Array{Int64,2}:
 1   7
 5  15

julia> X[1:2:5]
3-element Array{Int64,1}:
 1
 5
 9

julia> X[begin, :]
3-element Array{Int64,1}:
  1
  7
 13
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
]

.right-column[

* Julia is column-major, i.e., data is contiguous along the first index of an array

* an array "slice" expression like `array[1:5, :]` creates a copy of that data (except on the left-hand side of an assignment `array[1:5, :] = ...`)
  * when doing many operations on the slice, this can be preferable because it is more efficient to work on a smaller contiguous copy than to index into the original array
  * when doing just a few simple operations on the slice, the cost of the allocation and copy operations can be substantial

* the alternative is to create a "view" of the array, which is an array object that references the data of the original array in-place, without making a copy

* for individual slices this can be done by calling `view`; for a whole expression or block of code this can be done by putting `@views` in front of that expression

```julia
fcopy(x) = sum(x[2:end-1]);
fview1(x) = sum(view(x, 2:lastindex(x)-1));
@views fview2(x) = sum(x[2:end-1]);
```

```
fview2 (generic function with 1 method)
```

```julia
julia> x = rand(10^6);

julia> @time fcopy(x)
  0.024879 seconds (3 allocations: 7.629 MiB, 81.17% gc time)
500302.2119072615

julia> @time fview1(x)
  0.000554 seconds (2 allocations: 64 bytes)
500302.2119072615

julia> @time fview2(x)
  0.000601 seconds (2 allocations: 64 bytes)
500302.2119072615
```

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
]

.right-column[

* in Julia, indices start at 1, which is not always convenient

* *OffsetArrays.jl* provides arrays with arbitrary indices, similar to those in Fortran

* such arrays can be constructed as follows

```
julia> OA = OffsetArray(A, axis1, axis2, ...)
```

for example

```julia
julia> using OffsetArrays

julia> OA = OffsetArray(reshape(1:15, 3, 5), -1:1, 0:4)
3×5 OffsetArray(reshape(::UnitRange{Int64}, 3, 5), -1:1, 0:4) with eltype Int64 with indices -1:1×0:4:
 1  4  7  10  13
 2  5  8  11  14
 3  6  9  12  15

julia> OA[-1,0]
1
```

* in order to write general code that supports OffsetArrays as well as other abstract arrays, it is important to write index ranges in loops, etc., *not* using `1:length(A)` or `1:size(A,1)` but using `eachindex(A)` or `axes(A,1)` instead

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
#### Dot Notation
]

.right-column[

* for every binary operation like `^`, there is a corresponding "dot" operation `.^` that is automatically defined to perform `^` element-by-element on arrays
  * example: `[1,2,3]^3` is not defined, since there is no standard mathematical meaning to "cubing" a (non-square) array, but `[1,2,3] .^ 3` is defined as computing the elementwise (or "vectorized") result `[1^3, 2^3, 3^3]`

```julia
julia> [1,2,3] .^ 3
3-element Array{Int64,1}:
  1
  8
 27
```

* more
specifically, `a .^ b` performs a broadcast operation: it can combine arrays with scalars as well as arrays of the same size (performing the operation elementwise)
* moreover, like all vectorized "dot calls," these "dot operators" are fusing
* example: if you compute `2 .* A.^2 .+ sin.(A)` (or equivalently `@. 2A^2 + sin(A)`, using the `@.` macro) for an array `A`, it performs a single loop over `A`, computing `2a^2 + sin(a)` for each element `a` of `A`
* the dot syntax is also applicable to user-defined operators
* example: if you define `⊗(A,B) = kron(A,B)` to give a convenient infix syntax `A ⊗ B` for Kronecker products (`kron`), then `[A,B] .⊗ [C,D]` will compute `[A⊗C, B⊗D]` with no additional coding
* combining dot operators with numeric literals can be ambiguous
* example: it is not clear whether `1.+x` means `1. + x` or `1 .+ x`, therefore this syntax is disallowed, and spaces must be used around the operator

]

---

.left-column[
## Data Structures
### Tuples
### Named Tuples
### Dictionaries
### Arrays
#### Initialisation
#### Properties
#### Concatenation
#### Indexing
#### Dot Notation
#### Broadcasting
]

.right-column[

* sometimes it is useful to perform element-by-element binary operations on arrays of different sizes, such as adding a vector to each column of a matrix
* to implement this efficiently Julia provides `broadcast`, which expands singleton dimensions in array arguments to match the corresponding dimension in the other array without using extra memory, and applies the given function elementwise

```julia
julia> X = rand(2,3); x = rand(2,1); y = rand(1,2)
1×2 Array{Float64,2}:
 0.447744  0.812205

julia> broadcast(+, y, X)
Error: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 2 and 3")

julia> broadcast(+, x, y)
2×2 Array{Float64,2}:
 0.97062   1.33508
 0.625191  0.989652
```

* dotted operators such as `.+` and `.*` are equivalent to broadcast calls
* there is also a `broadcast!` function to specify an explicit
destination ```julia julia> Y = zero(X) 2×3 Array{Float64,2}: 0.0 0.0 0.0 0.0 0.0 0.0 julia> broadcast!(+, Y, x, X) 2×3 Array{Float64,2}: 0.753055 1.48662 1.34984 0.555182 1.02065 0.422494 ``` ] --- .left-column[ ## Data Structures ### Tuples ### Named Tuples ### Dictionaries ### Arrays ### Strings ] .right-column[ * the built-in concrete type used for strings in Julia is `String` * it supports the full range of Unicode characters via the UTF-8 encoding * strings are immutable: their value cannot be changed, so to construct a different string value, you construct a new string from parts of other strings * string literals are delimited by double quotes or triple double quotes ```julia julia> str = "Hello, world." "Hello, world." ``` ```julia julia> """Contains "quote" characters""" "Contains \"quote\" characters" ``` * characters can be extracted from a string using index syntax ```julia julia> str[begin] 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase) julia> str[5] 'o': ASCII/Unicode U+006F (category Ll: Letter, lowercase) julia> str[8:12] "world" ``` * other types can be converted to strings using `string` ```julia julia> string(42) "42" ``` ] --- .left-column[ ## Data Structures ### Tuples ### Named Tuples ### Dictionaries ### Arrays ### Strings ] .right-column[ * concatenation is achieved by passing several arguments to `string` ```julia julia> string("I am ", 10, " years old.") "I am 10 years old." ``` * Julia also provides `*` for string concatenation ```julia julia> "Hello" * ", " * "world" "Hello, world" ``` * Julia allows interpolation into string literals using `$` ```julia julia> age = 10 10 julia> "I am $age years old." "I am 10 years old." ``` ```julia julia> "Next year I will be $(age+1)." "Next year I will be 11." 
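
# Added illustration (a hedged sketch, not part of the original slides):
# `$`-interpolation, `string` and `*` should all build the same string here.
# The variables `who` and `n` are hypothetical names chosen for this example.
who = "world"; n = 2
a = "Hello, $who, $(n + 1) times"                        # interpolation
b = string("Hello, ", who, ", ", n + 1, " times")        # `string` with several arguments
c = "Hello, " * who * ", " * string(n + 1) * " times"    # `*` needs explicit conversion of numbers
@assert a == b == c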
``` * concatenation and string interpolation call string to convert objects into string form ] --- class: center, middle # Functions --- .left-column[ ## Functions ] .right-column[ * in Julia, a function is an object that maps a tuple of argument values to a return value * the basic syntax for defining functions in Julia is ```julia function func(x,y) x + y end ``` ``` func (generic function with 1 method) ``` this function accepts two arguments `x` and `y` and returns the value of the last expression evaluated, which is `x + y` * there also is a more compact "assignment form" of defining a function ```julia julia> func(x,y) = x + y func (generic function with 1 method) ``` in the assignment form, the body of the function must be a single expression * as simple function definitions are common in Julia, the short function syntax is accordingly quite idiomatic, considerably reducing both typing and visual noise * Unicode can also be used for function names ```julia julia> ∑(x,y) = x + y ∑ (generic function with 1 method) ``` ] --- .left-column[ ## Functions ] .right-column[ * a function is called using the traditional parenthesis syntax ```julia julia> func(2,3) 5 julia> ∑(2,3) 5 ``` * function arguments follow a convention sometimes called "pass-by-sharing", which means that values are not copied when they are passed to functions * function arguments themselves act as new variable *bindings* (new locations that can refer to values), but the values they refer to are identical to the passed values * modifications to mutable values (such as arrays) made within a function will be visible to the caller * without parentheses, the expression `func` refers to the function object, and can be passed around like any value ```julia julia> gunc = func func (generic function with 1 method) julia> gunc(2,3) 5 ``` ] --- .left-column[ ## Functions ### `return` ] .right-column[ * the value returned by a function is the value of the last expression evaluated, which, by default, is the 
last expression in the body of the function definition
* alternatively, the `return` keyword causes a function to return immediately, providing an expression whose value is returned

```julia
function hunc(x,y)
    return x * y
    x + y
end;
```

```
hunc (generic function with 1 method)
```

```julia
julia> hunc(2,3)
6
```

* functions that do not need to return a value should return `nothing`

```julia
function printx(x)
    println("x = $x")
    return nothing
end;
```

```
printx (generic function with 1 method)
```

* a return type can be specified in the function declaration using the `::` operator

```julia
function hunc(x,y)::Float64
    x * y
end;
```

```
hunc (generic function with 1 method)
```

```julia
julia> hunc(2,3)
6.0
```

]

---

.left-column[
## Functions
### `return`
]

.right-column[

* in Julia, one returns a tuple of values to simulate returning multiple values
* tuples can be created and destructured without needing parentheses, thereby providing an illusion that multiple values are being returned, rather than a single tuple value

```julia
function foo(a,b)
    a+b, a*b
end;
```

```
foo (generic function with 1 method)
```

```julia
julia> foo(2,3)
(5, 6)
```

* a typical usage of such a pair of return values extracts each value into a variable (tuple "destructuring" or "unpacking")

```julia
julia> x, y = foo(2,3)
(5, 6)

julia> x
5

julia> y
6
```

]

---

.left-column[
## Functions
### `return`
### Operators
]

.right-column[

* in Julia, most operators are just functions with support for special syntax
* accordingly, you can also apply them using parenthesized argument lists, just as you would any other function

```julia
julia> 1 + 2 + 3
6

julia> +(1,2,3)
6
```

* the infix form is exactly equivalent to the function application form: the former is parsed to produce the function call internally
* thus you also can assign and pass around operators such as `+` and `*` just like you would with other function values

```julia
julia> plus = +
+ (generic function with 185 methods)
julia> plus(1,2,3)
6
```

* however, under the name `plus`, the function does not support infix notation

]

---

.left-column[
## Functions
### `return`
### Operators
]

.right-column[

* a few special expressions correspond to calls to functions with non-obvious names

| Expression        | Calls          |
|:----------------- |:-------------- |
| `[A B C ...]`     | `hcat`         |
| `[A; B; C; ...]`  | `vcat`         |
| `[A B; C D; ...]` | `hvcat`        |
| `A'`              | `adjoint`      |
| `A[i]`            | `getindex`     |
| `A[i] = x`        | `setindex!`    |
| `A.n`             | `getproperty`  |
| `A.n = x`         | `setproperty!` |

]

---

.left-column[
## Functions
### `return`
### Operators
### Anonymous Functions
]

.right-column[

* in Julia, functions can also be created anonymously, without being given a name

```julia
julia> x -> x^2 + 2x - 1
#5 (generic function with 1 method)
```

* the result is a generic function, but with a compiler-generated name based on consecutive numbering
* you can also assign such a function to a variable

```julia
julia> poly = x -> x^2 + 2x - 1;
#7 (generic function with 1 method)

julia> poly(1)
2
```

* the primary use for anonymous functions is passing them to functions which take other functions as arguments
* example: `map` applies a function to each value of an array and returns a new array containing the resulting values

```julia
julia> map(x -> x^2 + 2x - 1, [1, 3, -1])
3-element Array{Int64,1}:
  2
 14
 -2
```

* anonymous functions can also accept multiple arguments

```julia
julia> (x,y,z)->2x+y-z;
#11 (generic function with 1 method)
```

]

---

.left-column[
## Functions
### `return`
### Operators
### Anonymous Functions
]

.right-column[

* passing functions as arguments to other functions is a powerful technique, but the syntax for it is not always convenient, especially when the function argument requires multiple lines
* one way around this is to assign the function to a variable and use `begin ...
end` ```julia func = x -> begin if x < 0 && iseven(x) return 0 elseif x == 0 return 1 else return x end end map(func, [1, 3, -1]) ``` * in addition, Julia provides a reserved word `do` for writing such code more clearly ```julia map([A, B, C]) do x # ... end ``` * the `do x` syntax creates an anonymous function with argument `x` and passes it as the first argument to `map` * similarly, `do a,b` would create a two-argument anonymous function, and a plain `do` would declare that what follows is an anonymous function of the form `() -> ...` * how these arguments are initialized depends on the "outer" function; here, `map` will sequentially set `x` to `A`, `B`, `C`, calling the anonymous function on each, just as would happen in the syntax `map(func, [A, B, C])` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ] .right-column[ * the `do` syntax makes it easier to use functions to effectively extend the language, since calls look like normal code blocks * example: there is a version of open that runs code ensuring that the opened file is eventually closed ```julia open("outfile", "w") do io write(io, data) end ``` * this is accomplished by the following definition ```julia function open(f::Function, args...) io = open(args...) try f(io) finally close(io) end end ``` * here, `open` first opens the file for writing and then passes the resulting output stream to the anonymous function defined in the `do ... end` block * after the function exits, `open` will make sure that the stream is properly closed, regardless of whether your function exited normally or threw an exception ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ] .right-column[ * it is often convenient to be able to write functions taking an arbitrary number of arguments ("varargs" functions) * in Julia, varargs functions are defined by following the last argument with an ellipsis ````julia julia> bar(a,b,x...) 
= (a,b,x);
bar (generic function with 1 method)
````

* the variables `a` and `b` are bound to the first two argument values as usual, and the variable `x` is bound to an iterable collection of the zero or more values passed to `bar` after its first two arguments

```julia
julia> bar(1,2)
(1, 2, ())

julia> bar(1,2,3)
(1, 2, (3,))

julia> bar(1,2,3,4)
(1, 2, (3, 4))
```

in all these cases, `x` is bound to a tuple of the trailing values passed to `bar`

]

---

.left-column[
## Functions
### `return`
### Operators
### Anonymous Functions
### Varargs
]

.right-column[

* the values contained in an iterable collection can also be "splatted" into a function call as individual arguments, for which one also uses `...` but in the function call

```julia
julia> x = (2, 3, 4)
(2, 3, 4)

julia> bar(1,2,x...)
(1, 2, (2, 3, 4))

julia> bar(1,x...)
(1, 2, (3, 4))
```

* the iterable object splatted into a function call need not be a tuple

```julia
julia> x = [3, 4]
2-element Array{Int64,1}:
 3
 4

julia> bar(1,2,x...)
(1, 2, (3, 4))
```

* the function that arguments are splatted into need not be a varargs function

```julia
julia> baz(a,b) = a + b;
baz (generic function with 1 method)

julia> baz(x...)
7

julia> baz(rand(3)...)
Error: MethodError: no method matching baz(::Float64, ::Float64, ::Float64) Closest candidates are: baz(::Any, ::Any) at none:1 ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ] .right-column[ * function arguments can also have default values in which case they might not need to be passed explicitly in every call ```julia julia> increase(x,a=1) = x+a; increase (generic function with 2 methods) ``` * with this definition, the function can be called with either one or two arguments, and `1` is automatically passed when the second argument is not specified ```julia julia> increase(4) 5 julia> increase(4,2) 6 ``` * optional arguments are actually just a convenient syntax for writing multiple method definitions with different numbers of arguments ```julia julia> methods(increase) # 2 methods for generic function "increase": [1] increase(x) in Main.##WeaveSandBox#268 at none:1 [2] increase(x, a) in Main.##WeaveSandBox#268 at none:1 ``` * default values can refer to other arguments with evaluation from left to right ```julia julia> increase(x,a=x) = x+a; increase (generic function with 2 methods) julia> increase(4) 8 ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ] .right-column[ * some functions need a large number of arguments, or have a large number of behaviours, so that it may be difficult to remember how to call such functions * keyword arguments can make these complex interfaces easier to use and extend by allowing arguments to be identified by name instead of only by position * functions with keyword arguments are defined using a semicolon in the signature ```julia function plot(x, y; style="solid", width=1, color="black") # ... 
end ``` * when the function is called, the semicolon is optional: one can either call `plot(x, y, width=2)` or `plot(x, y; width=2)` * an explicit semicolon is required only for passing varargs or computed keywords * extra keyword arguments can be collected using `...`, as in varargs functions ```julia function f(x; y=0, kwargs...) # ... end ``` * inside `f`, `kwargs` will be a key-value iterator over a named tuple * named tuples can be passed as keyword arguments using a semicolon in a call ```julia julia> f(x, z=1; kwargs...) ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ] .right-column[ * if a keyword argument is not assigned a default value, then it is required ```julia function func(x; y) # ... end; ``` ``` func (generic function with 2 methods) ``` ```julia julia> func(3, y=5) julia> func(3) Error: UndefKeywordError: keyword argument y not assigned ``` * when the keyword name is computed at runtime, one can also pass `key => value` expressions after a semicolon (here `key` needs to be a symbol) ```julia julia> plot(x, y; :width => 2) ``` is equivalent to ```julia julia> plot(x, y, width=2) ``` * if a keyword argument is specified more than once, typically by both splatting a vararg and explicitly, the rightmost occurrence takes precedence ```julia julia> plot(x, y; options..., width=2) ``` * explicitly specifying the same keyword argument multiple times is not allowed and results in a syntax error ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ### Composition ] .right-column[ * in Julia functions can be combined by composing or piping (chaining) them together * the function composition operator `(∘)` is used to compose the functions, so ```julia (f ∘ g)(args...) 
```

is the same as

```julia
f(g(args...))
```

* the composition operator can be typed at the REPL and suitably-configured editors using `\circ<tab>
` * examples ```julia julia> (sqrt ∘ +)(3, 6) 3.0 ``` ```julia julia> map(first ∘ reverse ∘ uppercase, split("you can compose functions like this")) 6-element Array{Char,1}: 'U' 'N' 'E' 'S' 'E' 'S' ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ### Composition ] .right-column[ * function chaining (sometimes called "piping" or "using a pipe" to send data to a subsequent function) is when you apply a function to the previous function's output ```julia julia> 1:10 |> sum |> sqrt 7.416198487095663 ``` the total produced by `sum` is passed to the `sqrt` function, which is equivalent to the composition ```julia julia> (sqrt ∘ sum)(1:10) 7.416198487095663 ``` * the pipe operator can also be used with broadcasting, as `.|>`, to provide a useful combination of the chaining/piping and dot vectorization syntax ```julia julia> ["a", "list", "of", "strings"] .|> [uppercase, reverse, titlecase, length] 4-element Array{Any,1}: "A" "tsil" "Of" 7 ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ### Composition ### Dot Syntax ] .right-column[ * it is often convenient to have "vectorized" versions of functions, which simply apply a given function `f(x)` to each element of an array `A` to yield a new array via `f(A)` * in Julia any function `f` can be applied elementwise to any array (or other collection) with the syntax `f.(A)` ```julia julia> X = [1.0, 2.0, 3.0]; 3-element Array{Float64,1}: 1.0 2.0 3.0 julia> sin.(π .* X) 3-element Array{Float64,1}: 1.2246467991473532e-16 -2.4492935982947064e-16 3.6739403974420594e-16 ``` * internally `f.(args...)` is equivalent to `broadcast(f, args...)`, which allows you to operate on multiple arrays (even of different shapes), or a mix of arrays and scalars ```julia julia> func(x,y) = 3x + 4y; func (generic function with 2 methods) julia> Y = [4.0, 5.0, 
6.0]; 3-element Array{Float64,1}: 4.0 5.0 6.0 julia> func.(π, X) 3-element Array{Float64,1}: 13.42477796076938 17.42477796076938 21.42477796076938 julia> func.(X, Y) 3-element Array{Float64,1}: 19.0 26.0 33.0 ``` ] --- .left-column[ ## Functions ### `return` ### Operators ### Anonymous Functions ### Varargs ### Optional Arguments ### Keyword Arguments ### Composition ### Dot Syntax ] .right-column[ * nested `f.(args...)` calls are fused into a single broadcast loop * example: `sin.(cos.(X))` is equivalent to `broadcast(x -> sin(cos(x)), X)`: there is only a single loop over `X`, and a single array is allocated for the result * maximum efficiency is typically achieved when the output array of a vectorized operation is pre-allocated, so that repeated calls do not allocate new arrays over and over again for the results * a convenient syntax for this is `X .= ...`, which is equivalent to `broadcast!(identity, X, ...)` except that the `broadcast!` loop is fused with any nested "dot" calls * example: `X .= sin.(Y)` is equivalent to `broadcast!(sin, X, Y)`, overwriting `X` with `sin.(Y)` in-place * if the left-hand side is an array-indexing expression, e.g. `X[begin+1:end] .= sin.(Y)`, then it translates to `broadcast!` on a `view`, e.g. ```julia julia> broadcast!(sin, view(X, firstindex(X)+1:lastindex(X)), Y) ``` so that the left-hand side is updated in-place * as adding dots to many operations and function calls in an expression can be tedious and lead to code that is difficult to read, the macro `@.` is provided to convert every function call, operation, and assignment in an expression into the "dotted" version ```julia julia> @. 
X = sin(cos(Y)) # equivalent to X .= sin.(cos.(Y))
```

* binary (or unary) operators like `.+` and `.+=` are handled with the same mechanism

]

---

class: center, middle

# Control Flow

---

.left-column[
## Control Flow
### Compound Expressions
]

.right-column[

* sometimes it is convenient to have a single expression which evaluates several subexpressions in order, returning the value of the last subexpression as its value
* there are two Julia constructs that accomplish this: `begin` blocks and `;` chains, with the value of both being that of the last subexpression

```julia
julia> z = begin
           x = 1
           y = 2
           x + y
       end
3

julia> z = (x = 1; y = 2; x + y)
3
```

* this syntax is particularly useful with the terse single-line function definition form introduced in the section on functions
* although it is typical, there is no requirement that `begin` blocks be multiline or that `;` chains be single-line

```julia
julia> begin x = 1; y = 2; x + y end
3

julia> (x = 1;
        y = 2;
        x + y)
3
```

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
]

.right-column[

* conditional evaluation allows portions of code to be evaluated or not evaluated depending on the value of a boolean expression
* anatomy of the `if-elseif-else` conditional syntax in Julia

```julia
if x < y
    println("x is less than y")
elseif x > y
    println("x is greater than y")
else
    println("x is equal to y")
end
```

* the `elseif` and `else` blocks are optional, and arbitrarily many `elseif` blocks can be used
* the condition expressions in the `if-elseif-else` construct are evaluated until the first one evaluates to `true`, after which the associated block is evaluated, and no further condition expressions or blocks are evaluated
* unlike C, MATLAB, Perl, Python, and Ruby – but like Java, and a few other stricter, typed languages – it is an error if the value of a conditional expression is anything but `true` or `false`

```julia
julia> if 1
           println("true")
       end
Error: TypeError: non-boolean (Int64) used in boolean context
```

this
error indicates that the conditional was of the wrong type: `Int64` rather than the required `Bool`

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
]

.right-column[

* `if` blocks are "leaky", i.e. they do not introduce a local scope, thus new variables defined inside the `if` clauses can be used after the `if` block, even if they weren't defined before

```julia
function test(x,y)
    if x < y
        relation = "less than"
    elseif x == y
        relation = "equal to"
    else
        relation = "greater than"
    end
    println("x is ", relation, " y.")
end;
```

```
test (generic function with 1 method)
```

```julia
julia> test(2, 1)
x is greater than y.
```

* when depending on this behaviour, all possible code paths must define a value for the variable

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
]

.right-column[

* the so-called "ternary operator", `?:`, is closely related to the `if-elseif-else` syntax, but is used where a conditional choice between single expression values is required, as opposed to conditional execution of longer blocks of code
* in most languages it is the only operator taking three operands

```julia
a ? b : c
```

* the expression `a`, before the `?`, is a condition expression; the expression `b`, before the `:`, is evaluated if `a` is `true`, and the expression `c`, after the `:`, is evaluated if `a` is `false`

```julia
julia> test(x, y) = println(x < y ? "less than" : "not less than");
test (generic function with 1 method)

julia> test(1, 2)
less than

julia> test(1, 0)
not less than
```

* the three-way example requires chaining multiple uses of the ternary operator

```julia
julia> test(x, y) = println(x < y ? "x is less than y" : x > y ? "x is greater than y" : "x is equal to y");
test (generic function with 1 method)

julia> test(1, 1)
x is equal to y
```

* the ternary operator is often used to discriminate return values

```julia
mymin(x, y) = x < y ?
x : y
```

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
]

.right-column[

* in a series of boolean expressions connected by the `&&` and `||` boolean operators, only the minimum number of expressions are evaluated as are necessary to determine the final boolean value of the entire chain
* in the expression `a && b`, the subexpression `b` is only evaluated if `a` evaluates to `true`
* in the expression `a || b`, the subexpression `b` is only evaluated if `a` evaluates to `false`

```julia
julia> true && true
true

julia> true && false
false

julia> false && true
false

julia> false && false
false

julia> true || true
true

julia> true || false
true

julia> false || true
true

julia> false || false
false
```

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
]

.right-column[

* this behaviour is frequently used in Julia to form an alternative to very short if statements
* instead of

```julia
if <cond>
    <statement>
end
```

one can write

```julia
<cond> && <statement>
```

which could be read as: `<cond>` and then `<statement>`

* instead of

```julia
if ! <cond>
    <statement>
end
```

one can write

```julia
<cond> || <statement>
```

which could be read as: `<cond>` or else `<statement>`

* while the condition expressions used in the operands of `&&` or `||` must be boolean values (`true` or `false`), any type of expression can be used at the end of a conditional chain

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
### Loops
#### `for`
]

.right-column[

* a `for` loop iterates through the values of an iterable, assigning each one in turn to the loop variable

```julia
julia> for
<variable> in <iterable>
# ... end ``` * the iterable is usually a range like `1:3`, representing the sequence of numbers 1, 2, 3 ```julia julia> for i in 1:3 println(i) end 1 2 3 ``` but it can also be an array, tuple or any other iterable container ```julia julia> for x ∈ [1.0, ℯ, π] println(x) end 1.0 2.718281828459045 3.141592653589793 ``` ```julia julia> for s = ("foo","bar") println(s) end foo bar ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops #### `for` ] .right-column[ * the loop variable `i` is visible only inside of the `for` loop, and not outside/afterwards ```julia julia> for i in 1:3 println(i) end 1 2 3 julia> println(i) Error: UndefVarError: i not defined ``` * even if a variable with the same name as the loop variable exists, the `for` loop does not modify it but creates a local loop variable ```julia julia> i = 0; 0 julia> for i in 1:3 println(i) end 1 2 3 julia> println(i) 0 ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops #### `for` ] .right-column[ * in order to use an outer local variable `i`, the `outer` keyword has to be specified ```julia function test_loop() i = 0; for outer i in 1:3 println(i) end println(i) end; ``` ``` test_loop (generic function with 1 method) ``` ```julia julia> test_loop() 1 2 3 3 ``` * one can explicitly create variables local to the `for` loop that shadow outer variables ```julia function test_loop() x = 0; for i in 1:3 local x = i^2 println(x) end println(x) end; ``` ``` test_loop (generic function with 1 method) ``` ```julia julia> test_loop() 1 4 9 0 ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops #### `for` ] .right-column[ * multiple nested for loops can be combined into a single outer loop, forming the cartesian product of its iterables ```julia julia> for i in 1:2, j in 3:4 println((i, j)) end (1, 3) (1, 4) (2, 3) (2, 4) ``` * with this syntax, 
iterables may still refer to outer loop variables; e.g.

```julia
julia> for i in 1:3, j in 1:i
           println((i, j))
       end
(1, 1)
(2, 1)
(2, 2)
(3, 1)
(3, 2)
(3, 3)
```

* a `break` statement inside such a loop exits the entire nest of loops, not just the inner one

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
### Loops
#### `for`
]

.right-column[

* both variables (`i` and `j`) are set to their current iteration values each time the inner loop runs, therefore, assignments to `i` will not be visible to subsequent iterations

```julia
julia> for i in 1:2, j in 3:4
           println((i, j))
           i = 0
       end
(1, 3)
(1, 4)
(2, 3)
(2, 4)
```

* if this example is rewritten to use a `for` keyword for each variable, then the output would be different

```julia
julia> for i in 1:2
           for j in 3:4
               println((i, j))
               i = 0
           end
       end
(1, 3)
(0, 4)
(2, 3)
(0, 4)
```

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
### Loops
#### `for`
]

.right-column[

* the iteration in a `for` loop can be stopped before the end of the iterable object is reached using the `break` keyword

```julia
julia> for j in 1:1000
           println(j)
           if j >= 5
               break
           end
       end
1
2
3
4
5
```

* one can also stop an iteration and move on to the next one immediately using the `continue` keyword

```julia
julia> for i in 1:10
           if i % 3 != 0
               continue
           end
           println(i)
       end
3
6
9
```

]

---

.left-column[
## Control Flow
### Compound Expressions
### Conditionals
### Short Circuit
### Loops
#### `for`
#### `while`
]

.right-column[

* a `while` loop evaluates a condition expression and, as long as it remains `true`, keeps evaluating the body of the loop

```julia
julia> while <condition>
# ... end ``` * if the condition expression is `false` when the `while` loop is first reached, the body is never evaluated ```julia julia> i = 0 while i > 0 println(i) i += 1 end ``` * when running in global scope, e.g. in the REPL, inside the `while` loop a loop variable `i` is accessible for reading, but not for writing unless the `global` keyword is specified ```julia julia> i = 1; 1 julia> while i ≤ 5 println(i) global i += 1 end 1 2 3 4 5 ``` * in Julia v1.5 the scope behaviour changes, so that this is not necessary anymore ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions ] .right-column[ * if an unexpected condition occurs and a function is unable to return a reasonable value to its caller, it may be best to either terminate the program while printing a diagnostic error message, or execute some special code to handle that exception * Julia has a long list of built-in exceptions that interrupt the normal flow of control when an unexpected condition has occurred .three-column-one[ * * `ArgumentError` * `BoundsError` * `CompositeException` * `DimensionMismatch` * `DivideError` * `DomainError` * `EOFError` * `ErrorException` * `InexactError` ] .three-column-two[ * * `InitError` * `InterruptException` * `InvalidStateException` * `KeyError` * `LoadError` * `OutOfMemoryError` * `ReadOnlyMemoryError` * `RemoteException` * `MethodError` ] .three-column-three[ * * `OverflowError` * `Meta.ParseError` * `SystemError` * `TypeError` * `UndefRefError` * `UndefVarError` * `StringIndexError` ] * example: the `sqrt` function throws a `DomainError` if applied to a negative real value ```julia julia> sqrt(-1) Error: DomainError with -1.0: sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)). 
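
# Added illustration (a hedged sketch, not part of the original slides): a few of the
# built-in exceptions listed above, triggered on purpose. `thrown` is a hypothetical
# helper introduced only for this example; it returns the exception raised by `f`
# (or `nothing` if `f` completes normally).
function thrown(f)
    try
        f()
        return nothing
    catch e
        return e
    end
end

@assert thrown(() -> sqrt(-1)) isa DomainError          # real square root of a negative number
@assert thrown(() -> [1, 2, 3][4]) isa BoundsError      # index outside the array
@assert thrown(() -> 1 ÷ 0) isa DivideError             # integer division by zero
@assert thrown(() -> Dict(:a => 1)[:b]) isa KeyError    # missing dictionary key
@assert thrown(() -> sqrt(4)) === nothing               # no exception raised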
``` * you may define your own exceptions in the following way ```julia julia> struct MyCustomException <: Exception end ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors ] .right-column[ * the `error` function is used to produce an `ErrorException` that interrupts the normal flow of control * example: if we want to stop execution immediately if the square root of a negative number is taken, we can define a fussy version of the `sqrt` function that raises an `error` if its argument is negative ```julia julia> fussy_sqrt(x) = x >= 0 ? sqrt(x) : error("negative x not allowed") fussy_sqrt (generic function with 1 method) julia> fussy_sqrt(2) 1.4142135623730951 julia> fussy_sqrt(-1) Error: negative x not allowed ``` * if `fussy_sqrt` is called with a negative value from another function, instead of trying to continue execution of the calling function, it returns immediately, displaying the error message in the interactive session ```julia julia> function verbose_fussy_sqrt(x) println("before fussy_sqrt") r = fussy_sqrt(x) println("after fussy_sqrt") return r end verbose_fussy_sqrt (generic function with 1 method) julia> verbose_fussy_sqrt(-1) before fussy_sqrt Error: negative x not allowed ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors #### `throw` ] .right-column[ * general exceptions can be created explicitly with `throw` * the `fussy_sqrt` example, defined only for nonnegative numbers, could be rewritten to throw a `DomainError` if the argument is negative ```julia julia> fussy_sqrt(x) = x >= 0 ? 
sqrt(x) : throw(DomainError(x, "argument must be nonnegative")) fussy_sqrt (generic function with 1 method) julia> fussy_sqrt(2) 1.4142135623730951 julia> fussy_sqrt(-1) Error: DomainError with -1: argument must be nonnegative ``` * note that `DomainError` without parentheses is not an exception, but a type of exception, and needs to be called to obtain an `Exception` object ```julia julia> typeof(DomainError(nothing)) <: Exception true julia> typeof(DomainError) <: Exception false julia> typeof(DomainError) DataType ``` * some exception types take one or more arguments that are used for error reporting ```julia julia> throw(UndefVarError(:x)) Error: UndefVarError: x not defined ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors #### `throw` #### `try`/`catch` ] .right-column[ * the `try`/`catch` statement allows for `Exception`s to be tested for, and for the graceful handling of things that may ordinarily break an application * example: in the following code the `sqrt` function would normally throw an exception; by placing a `try`/`catch` block around it we can mitigate that ```julia julia> try sqrt("ten") catch println("You should have entered a numeric value") end You should have entered a numeric value ``` * one may choose how to handle this exception, whether logging it, returning a placeholder value or just printing out a statement * drawback: using a `try`/`catch` block is much slower than using conditional branching to handle those situations * the power of the `try`/`catch` construct lies in the ability to unwind a deeply nested computation immediately to a much higher level in the stack of calling functions ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors #### `throw` #### `try`/`catch` ] .right-column[ * `try`/`catch` statements allow the `Exception` to be saved in a variable, e.g. 
to branch depending on the type of the `Exception` ```julia julia> sqrt_inv(x) = try sqrt(1 ÷ x) catch e if isa(e, DivideError) println("DivideError: you could compute sqrt(1 / x) instead.") elseif isa(e, DomainError) println("DomainError: you could compute sqrt(complex(1 ÷ x, 0)) instead.") end end sqrt_inv (generic function with 1 method) julia> sqrt_inv(1) 1.0 julia> sqrt_inv(0) DivideError: you could compute sqrt(1 / x) instead. julia> sqrt_inv(-1) DomainError: you could compute sqrt(complex(1 ÷ x, 0)) instead. ``` * the symbol following `catch` is always interpreted as a name for the exception, so care is needed when writing `try`/`catch` expressions on a single line ```julia julia> x = -1; julia> try sqrt(x) catch x end julia> try sqrt(x) catch; x end -1 ``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors #### `throw` #### `try`/`catch` ] .right-column[ * `rethrow` rethrows the current exception from within a `catch` block, so that it continues to propagate as if it had not been caught ```julia julia> sqrt_inv(x) = try sqrt(1 ÷ x) catch e if isa(e, DivideError) println("DivideError: you could compute sqrt(1 / x) instead.") else rethrow() end end sqrt_inv (generic function with 1 method) julia> sqrt_inv(1) 1.0 julia> sqrt_inv(0) DivideError: you could compute sqrt(1 / x) instead. julia> sqrt_inv(-1) Error: DomainError with -1.0: sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)). 
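# hedged aside, not part of the original slides: because the DomainError is rethrown,
# an enclosing try/catch further up the call stack can still handle it
julia> try
           sqrt_inv(-1)
       catch e
           typeof(e)
       end
DomainError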
``` ] --- .left-column[ ## Control Flow ### Compound Expressions ### Conditionals ### Short Circuit ### Loops ### Exceptions #### Errors #### `throw` #### `try`/`catch` #### `finally` ] .right-column[ * in code that performs state changes or uses resources like files, there is typically clean-up work (such as closing files) that needs to be done when the code is finished * exceptions potentially complicate this task, since they can cause a block of code to exit before reaching its normal end * the `finally` keyword provides a way to run some code when a given block of code exits, regardless of how it exits ```julia f = open("file") try # operate on file f finally close(f) end ``` * when control leaves the `try` block (for example due to a `return`, or just finishing normally), `close(f)` will be executed * if the `try` block exits due to an exception, the exception will continue propagating * if a `catch` block is combined with `try` and `finally`, the `finally` block will run after `catch` has handled the error ] --- class: center, middle # Types --- .left-column[ ## Types ] .right-column[ * traditionally, there exist two quite different kinds of type systems: * static type systems, where every program expression must have a type computable before execution of the program * dynamic type systems, where nothing is known about types until run time, when the actual values manipulated by the program are available * the ability to write code that can operate on different types is called polymorphism * in classic dynamically typed languages all code is polymorphic: only by explicitly checking types, or when objects fail to support operations at run-time, are the types of any values ever restricted * Julia's type system is dynamic, but allows indicating that certain values are of specific types * this can be of great assistance to the compiler in generating efficient code, but even more significantly, it allows method dispatch on the types of function arguments to be 
deeply integrated with the language * Julia's default behaviour when types are omitted is to allow values to be of any type; thus, one can write many useful Julia functions without ever explicitly using types * when additional expressiveness is needed, it is easy to gradually introduce explicit type annotations into previously "untyped" code; this serves three primary purposes * to take advantage of Julia's powerful multiple-dispatch mechanism * to improve human readability * and to catch programmer errors ] --- .left-column[ ## Types ] .right-column[ * Julia's type system is dynamic, nominative (name-based) and parametric * generic types can be parameterized * the hierarchical relationships between types are explicitly declared, rather than implied by compatible structure * concrete types may not subtype each other: all concrete types are final and may only have abstract types as their supertypes * other high-level aspects of Julia's type system: * there is no division between object and non-object values: all values in Julia are true objects having a type that belongs to a single, fully connected type graph, all nodes of which are equally first-class as types * there is no meaningful concept of a "compile-time type": the only type a value has is its actual type when the program is running; this is called a "run-time type" in object-oriented languages where the combination of static compilation with polymorphism makes this distinction significant * only values, not variables, have types; variables are simply names bound to values * both abstract and concrete types can be parameterized by other types, by symbols, by values of certain types (essentially, things like numbers and bools), and also by tuples thereof; type parameters may be omitted when they do not need to be referenced or restricted * while many Julia programmers may never feel the need to write code that explicitly uses types, some kinds of programming become clearer, simpler, faster and more robust 
with declared types ] --- .left-column[ ## Types ### Type Declarations ] .right-column[ * there are two primary reasons to attach type annotations to expressions and variables in programs: * as an assertion to help confirm that your program works the way you expect, * to provide extra type information to the compiler, which can then improve performance in some cases * in Julia, the `::` operator can be used to annotate types * when appended to an expression computing a value, the `::` operator is read as "is an instance of"; it can be used anywhere to assert that the value of the expression on the left is an instance of the type on the right * when the type on the right is concrete, the value on the left must have that type as its implementation (recall that all concrete types are final, so no implementation is a subtype of any other) * when the type on the right is abstract, it suffices for the value to be implemented by a concrete type that is a subtype of the abstract type * if the type assertion is not true, an exception is thrown, otherwise, the left-hand value is returned ```julia julia> (1+2)::AbstractFloat Error: TypeError: in typeassert, expected AbstractFloat, got Int64 julia> (1+2)::Int 3 ``` * this allows a type assertion to be attached to any expression in-place ] --- .left-column[ ## Types ### Type Declarations ] .right-column[ * when the `::` operator is appended to a variable on the left-hand side of an assignment, or as part of a local declaration, it declares the variable to always have the specified type, like a type declaration in a statically-typed language such as C * every value assigned to the variable will be converted to the declared type ```julia function hundred() x::Int8 = 100 x end; ``` ``` hundred (generic function with 1 method) ``` ```julia julia> hundred() 100 julia> typeof(hundred()) Int8 ``` * this feature is useful for avoiding performance "gotchas" that could occur if one of the assignments to a variable changed its type 
unexpectedly * this "declaration" behaviour only occurs in specific contexts and applies to the whole current scope, even before the declaration ```julia julia> local x::Int8 # in a local declaration julia> x::Int8 = 10 # as the left-hand side of an assignment Error: syntax: type declarations on global variables are not yet supported ``` * currently, type declarations cannot be used in global scope, e.g. in the REPL, since Julia does not yet have constant-type globals ] --- .left-column[ ## Types ### Type Declarations ] .right-column[ * declarations can also be attached to function definitions to declare their return type ```julia function sinc(x)::Float64 if x == 0 return 1 end return sin(pi*x)/(pi*x) end; ``` ``` sinc (generic function with 1 method) ``` ```julia julia> sinc(0) 1.0 julia> typeof(sinc(0)) Float64 julia> sinc(1) 3.8981718325193755e-17 julia> typeof(sinc(1)) Float64 ``` * returning from this function behaves just like an assignment to a variable with a declared type: the value is always converted to `Float64` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ] .right-column[ * in Julia concrete types may not subtype each other: all concrete types are final and may only have abstract types as their supertypes * this might at first seem unduly restrictive, but in fact has many beneficial consequences with surprisingly few drawbacks * it turns out that being able to inherit behaviour is much more important than being able to inherit structure, and inheriting both causes significant difficulties in traditional object-oriented languages * abstract types cannot be instantiated, and serve only as nodes in the type graph, thereby describing sets of related concrete types: those concrete types which are their descendants * abstract types, even though they have no instantiation, are the backbone of Julia's type system; they form the conceptual hierarchy which makes Julia's type system more than just a collection of object implementations ] 
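---

.left-column[
## Types
### Type Declarations
### Abstract Types
]

.right-column[

* a minimal sketch of inheriting behaviour rather than structure; the names `Shape`, `Circle`, `Square`, `area` and `describe` are hypothetical and only serve as illustration: `describe` is written once against the abstract type and works for every concrete subtype that provides an `area` method

```julia
abstract type Shape end          # abstract: serves only as a node in the type graph

struct Circle <: Shape           # concrete subtypes declare their own fields
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# each concrete type supplies its own implementation (hypothetical helper)
area(c::Circle) = pi * c.radius^2
area(s::Square) = s.side^2

# behaviour defined once for the abstract type is shared by all of its subtypes
describe(s::Shape) = string(nameof(typeof(s)), " with area ", area(s))

describe(Square(2.0))            # "Square with area 4.0"
Circle <: Shape                  # true
```

* no fields are inherited here: `Circle` and `Square` share behaviour only through methods defined on `Shape`, and an attempt to declare a subtype of the concrete type `Circle` is rejected with an error
]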
--- .left-column[ ## Types ### Type Declarations ### Abstract Types ] .right-column[ * when discussing integers and floating-point numbers, we introduced a variety of concrete types of numeric values * although they have different representation sizes, `Int8`, `Int16`, `Int32`, `Int64` and `Int128` all have in common that they are signed integer types * likewise `UInt8`, `UInt16`, `UInt32`, `UInt64` and `UInt128` are all unsigned integer types * `Float16`, `Float32` and `Float64`, in turn, are distinct in being floating-point types rather than integers * it is common for a piece of code to make sense, for example, only if its arguments are some kind of integer, but not really depend on what particular kind of integer, e.g. the greatest common divisor algorithm works for all kinds of integers, but will not work for floating-point numbers * abstract types allow the construction of a hierarchy of types, providing a context into which concrete types can fit * this allows you, for example, to easily program to any type that is an integer, without restricting an algorithm to a specific type of integer ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ] .right-column[ * the `abstract type` keyword introduces a new abstract type with the name `«name»` ```julia abstract type «name» end ``` * this name can be optionally followed by `<:` and an already-existing type, indicating that the newly declared abstract type is a subtype of this "parent" type ```julia abstract type «name» <: «supertype» end ``` * when no supertype is given, the default supertype is `Any` – a predefined abstract type that all objects are instances of and all types are subtypes of * in type theory, `Any` is commonly called "top" as it is at the apex of the type graph * Julia also has a predefined abstract "bottom" type, at the nadir of the type graph, which is written as `Union{}` * it is the exact opposite of `Any`: no object is an instance of `Union{}` and all types are 
supertypes of `Union{}` * the `<:` operator in general means "is a subtype of"; used in declarations, it declares the right-hand type to be an immediate supertype of the newly declared type * it can also be used in expressions as a subtype operator which returns true when its left operand is a subtype of its right operand ```julia julia> Integer <: Number true julia> String <: Number false ``` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ] .right-column[ * example: let us consider the abstract types that make up Julia's numerical hierarchy ```julia abstract type Number end abstract type Real <: Number end abstract type AbstractFloat <: Real end abstract type Integer <: Real end abstract type Signed <: Integer end abstract type Unsigned <: Integer end ``` * the `Number` type is a direct child type of `Any`, and `Real` is its child * in turn, `Real` has several children, two of which are shown here: `Integer` and `AbstractFloat`, separating the world into representations of integers and representations of real numbers * representations of real numbers include, of course, floating-point types, but also include other types, such as rationals * hence, `AbstractFloat` is a proper subtype of `Real`, including only floating-point representations of real numbers * integers are further subdivided into `Signed` and `Unsigned` varieties ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ] .right-column[ * a primitive type is a concrete type whose data consists of plain old bits such as integers and floating-point values * unlike most languages, Julia lets you declare your own primitive types, rather than providing only a fixed set of built-in ones * in fact, the standard primitive types are all defined in the language itself: ```julia primitive type Float16 <: AbstractFloat 16 end primitive type Float32 <: AbstractFloat 32 end primitive type Float64 <: AbstractFloat 64 end primitive type Bool <: Integer 8 end 
primitive type Char <: AbstractChar 32 end primitive type Int8 <: Signed 8 end primitive type UInt8 <: Unsigned 8 end primitive type Int16 <: Signed 16 end primitive type UInt16 <: Unsigned 16 end primitive type Int32 <: Signed 32 end primitive type UInt32 <: Unsigned 32 end primitive type Int64 <: Signed 64 end primitive type UInt64 <: Unsigned 64 end primitive type Int128 <: Signed 128 end primitive type UInt128 <: Unsigned 128 end ``` * the general syntaxes for declaring a primitive type are ```julia primitive type «name» «bits» end primitive type «name» <: «supertype» «bits» end ``` * the number of bits indicates how much storage the type requires and the name gives the new type a name ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ] .right-column[ * a primitive type can optionally be declared to be a subtype of some supertype, which defaults to `Any` if an immediate supertype is omitted * the declaration of `Bool` therefore means that a boolean value takes eight bits to store, and has `Integer` as its immediate supertype ```julia primitive type Bool <: Integer 8 end ``` * currently, only sizes that are multiples of 8 bits are supported, therefore boolean values, although they really need just a single bit, cannot be declared to be any smaller than eight bits * the types `Bool`, `Int8` and `UInt8` all have identical representations: they are eight-bit chunks of memory ```julia primitive type Int8 <: Signed 8 end primitive type UInt8 <: Unsigned 8 end ``` * since Julia's type system is nominative, however, they are not interchangeable despite having identical structure * a fundamental difference between them is that they have different supertypes: `Bool`'s direct supertype is `Integer`, `Int8`'s is `Signed`, and `UInt8`'s is `Unsigned` * all other differences between `Bool`, `Int8`, and `UInt8` are matters of behaviour: the way functions are defined to act when given objects of these types as arguments * this is why a 
nominative (or name-based) type system is necessary: if structure determined type, which in turn dictates behaviour, then it would be impossible to make `Bool` behave any differently than `Int8` or `UInt8` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ] .right-column[ * composite types are called records, structs, or objects in various languages * a composite type is a collection of named fields, an instance of which can be treated as a single value * in mainstream object oriented languages, such as C++, Java, Python and Ruby, composite types also have named functions associated with them, and the combination is called an "object" * in purer object-oriented languages, such as Ruby or Smalltalk, all values are objects whether they are composites or not * in less pure object oriented languages, including C++ and Java, some values, such as integers and floating-point values, are not objects, while instances of user-defined composite types are true objects with associated methods * in Julia, all values are objects, but functions are not bundled with the objects they operate on * this is necessary since Julia chooses which method of a function to use by *multiple dispatch*, meaning that the types of *all* of a function's arguments are considered when selecting a method, rather than just the first one * thus, it would be inappropriate for functions to "belong" to only their first argument * organizing methods into function objects rather than having named bags of methods "inside" each object ends up being a highly beneficial aspect of the language design ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ] .right-column[ * composite types are introduced with the `struct` keyword followed by a block of field names, optionally annotated with types using the `::` operator ```julia struct Foo bar baz::Int qux::Float64 end ``` * fields with no type 
annotation default to `Any`, and can thus hold any type of value * new objects of type `Foo` are created by applying the `Foo` type object like a function to values for its fields ```julia julia> foo = Foo("Hello, world.", 23, 1.5) Foo("Hello, world.", 23, 1.5) julia> typeof(foo) Foo ``` * when a type is applied like a function it is called a *constructor* * two constructors are generated automatically (referred to as *default constructors*): * one accepts any arguments and tries to convert them to the types of the fields * and the other accepts arguments that match the field types exactly * since the `bar` field is unconstrained in type, any value will do; however, the value for `baz` must be convertible to `Int` ```julia julia> Foo((), 23.5, 1) Error: InexactError: Int64(23.5) ``` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ] .right-column[ * the `fieldnames` function returns a list of field names ```julia julia> fieldnames(Foo) (:bar, :baz, :qux) ``` * the field values of a composite object can be accessed using the `foo.bar` notation ```julia julia> foo.bar "Hello, world." julia> foo.baz 23 julia> foo.qux 1.5 ``` * composite objects declared with `struct` are immutable; they cannot be modified after construction * this may seem odd at first, but it has several advantages: * it can be more efficient; some structs can be packed efficiently into arrays, and in some cases the compiler is able to avoid allocating immutable objects entirely * it is not possible to violate the invariants provided by the type's constructors * code using immutable objects can be easier to reason about * an immutable object might contain mutable objects, such as arrays, as fields; those contained objects will remain mutable; only the fields of the immutable object itself cannot be changed to point to 
different objects * mutable composite objects can be declared with the keyword `mutable struct` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types #### Singletons ] .right-column[ * immutable composite types with no fields are singletons: there can be only one instance of such types ```julia julia> struct NoFields end julia> NoFields() === NoFields() true ``` * the `===` function confirms that the "two" constructed instances of `NoFields` are actually one and the same * without a discussion of *Parametric Methods* and *Conversion*, it is difficult to explain the utility of the singleton type construct, but in short, it allows one to specialize function behaviour on specific type values * this is useful for writing methods (especially parametric ones) whose behaviour depends on a type that is given as an explicit argument rather than implied by the type of one of its arguments ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types #### Singletons #### Mutable Composite Types ] .right-column[ * if a composite type is declared with `mutable struct` instead of `struct`, then instances of it can be modified ```julia mutable struct Bar baz qux::Float64 end ``` ```julia julia> bar = Bar("Hello", 1.5); julia> bar.qux = 2.0 2.0 julia> bar.baz = 1//2 1//2 ``` * mutable values are generally allocated on the heap, have stable memory addresses, and are passed to functions as pointers * a mutable object is like a little container that might hold different values over time, and so can only be reliably identified with its address * in contrast, an instance of an immutable type is associated with specific field values and the field values alone tell you everything about the object * in deciding whether to make a type mutable, ask whether two instances with the same 
field values would be considered identical, or if they might need to change independently over time; if they would be considered identical, the type should probably be immutable ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### Immutability ] .right-column[ Two essential properties define immutability in Julia: * it is not permitted to modify the value of an immutable type * for bits types this means that the bit pattern of a value once set will never change and that value is the identity of a bits type * for composite types, this means that the identity of the values of its fields will never change; when the fields are bits types, that means their bits will never change; for fields whose values are mutable types like arrays, that means the fields will always refer to the same mutable value even though that mutable value's content may itself be modified * an object with an immutable type may be copied freely by the compiler since its immutability makes it impossible to programmatically distinguish between the original object and a copy * in particular, this means that small enough immutable values like integers and floats are typically passed to functions in registers (or stack allocated) 
* mutable values, on the other hand are heap-allocated and passed to functions as pointers to heap-allocated values except in cases where the compiler is sure that there's no way to tell that this is not what is happening ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### Immutability ### Declared Types ] .right-column[ * the three kinds of types (abstract, primitive, composite) are actually all closely related: they share the same key properties * they are explicitly declared * they have names * they have explicitly declared supertypes * they may have parameters * because of these shared properties, these types are internally represented as instances of the same concept, `DataType`, which is the type of any of these types ```julia julia> typeof(Real) DataType julia> typeof(Int) DataType ``` * a `DataType` may be abstract or concrete * if it is concrete, it has a specified size, storage layout, and (optionally) field names * a primitive type is a `DataType` with nonzero size, but no field names * a composite type is a `DataType` that has field names or is empty (zero size) * every concrete value in the system is an instance of some `DataType` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### Immutability ### Declared Types ### Type Unions ] .right-column[ * a type union is a special abstract type which includes as objects all instances of any of its argument types, constructed using the special `Union` keyword ```julia julia> IntOrString = Union{Int,AbstractString} Union{Int64, AbstractString} julia> 1 :: IntOrString 1 julia> "Hello!" :: IntOrString "Hello!" 
julia> 1.0 :: IntOrString Error: TypeError: in typeassert, expected Union{Int64, AbstractString}, got Float64 ``` * the compilers for many languages have an internal union construct for reasoning about types; Julia simply exposes it to the programmer * the Julia compiler is able to generate efficient code in the presence of `Union` types with a small number of types, by generating specialized code in separate branches for each possible type * a particularly useful case of a `Union` type is `Union{T, Nothing}`, where `T` can be any type and `Nothing` is the singleton type whose only instance is the object `nothing` * this is the Julia equivalent of Nullable, Option or Maybe types in other languages * declaring a function argument or a field as `Union{T, Nothing}` allows setting it either to a value of type `T`, or to `nothing` to indicate that there is no value * note that Julia objects cannot be "null" by default: when a reference (variable, object field, or array element) is uninitialized, accessing it will immediately throw an error ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### Immutability ### Declared Types ### Type Unions ### Type Aliases ] .right-column[ * sometimes it is convenient to introduce a new name for an already expressible type * this can be done with a simple assignment statement * for example, `UInt` is aliased to either `UInt32` or `UInt64` as is appropriate for the size of pointers on the system ```julia # 32-bit system: julia> UInt UInt32 # 64-bit system: julia> UInt UInt64 ``` * this is accomplished via the following code in `base/boot.jl` ```julia if Int === Int64 const UInt = UInt64 else const UInt = UInt32 end ``` ``` UInt64 ``` * of course, this depends on what `Int` is aliased to, but that is predefined to be the correct type, either `Int32` or `Int64` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### 
Immutability ### Declared Types ### Type Unions ### Type Aliases ### Operations on Types ] .right-column[ * since types in Julia are themselves objects, ordinary functions can operate on them * some functions that are particularly useful for working with or exploring types have already been introduced, such as the `<:` operator, which indicates whether its left hand operand is a subtype of its right hand operand * the `isa` function tests if an object is of a given type and returns `true` or `false` ```julia julia> isa(1, Int) true julia> isa(1, AbstractFloat) false ``` * the `typeof` function returns the type of its argument; since types are objects, they also have types ```julia julia> typeof(Rational{Int}) DataType julia> typeof(Union{Int,String}) Union ``` * What if we repeat the process? What is the type of a type of a type? In Julia, types are all composite values and thus all have a type of `DataType`, which is its own type ```julia julia> typeof(DataType) DataType julia> typeof(Union) DataType ``` ] --- .left-column[ ## Types ### Type Declarations ### Abstract Types ### Primitive Types ### Composite Types ### Immutability ### Declared Types ### Type Unions ### Type Aliases ### Operations on Types ] .right-column[ * the function `supertype` reveals a type's supertype * only declared types (`DataType`) have unambiguous supertypes ```julia julia> supertype(Float64) AbstractFloat julia> supertype(Number) Any julia> supertype(AbstractString) Any julia> supertype(Any) Any ``` * if `supertype` is applied to other type objects (or non-type objects), a `MethodError` is raised ```julia julia> supertype(Union{Float64,Int64}) Error: MethodError: no method matching supertype(::Type{Union{Float64, Int64}}) Closest candidates are: supertype(!Matched::DataType) at operators.jl:44 supertype(!Matched::UnionAll) at operators.jl:49 ``` ] --- class: center, middle # Methods --- .left-column[ ## Methods ] .right-column[ * a function is an object that maps a tuple of arguments to 
a return value, or throws an exception if no appropriate value can be returned * it is common for the same conceptual function or operation to be implemented quite differently for different types of arguments: adding two integers is very different from adding two floating-point numbers, both of which are distinct from adding an integer to a floating-point number * despite their implementation differences, these operations all fall under the general concept of "addition"; accordingly, in Julia, these behaviours all belong to a single object, that is the `+` function * to facilitate using many different implementations of the same concept smoothly, functions need not be defined all at once, but can be defined piecewise by providing specific behaviours for certain combinations of argument types and counts * a definition of one possible behaviour for a function is called a *method* * thus far, most examples of functions we considered were defined with a single method, applicable to all types of arguments * the signatures of method definitions can be annotated to indicate the types of arguments in addition to their number, and more than a single method definition may be provided * when a function is applied to a particular tuple of arguments, the most specific method applicable to those arguments is applied ] --- .left-column[ ## Methods ### Dispatch ] .right-column[ * the choice of which method to execute when a function is applied is called dispatch * Julia allows the dispatch process to choose which of a function's methods to call based on the number of arguments given, and on the types of all of the function's arguments * this is different from traditional object-oriented languages, where dispatch occurs based only on the first argument, which often has a special argument syntax, and is sometimes implied rather than explicitly written as an argument * in C++ or Java, for example, in a method call like `obj.meth(arg1,arg2)`, the object `obj` "receives" the method 
call and is implicitly passed to the method via the `this` keyword, rather than as an explicit method argument * when the current `this` object is the receiver of a method call, it can be omitted altogether, writing just `meth(arg1,arg2)`, with `this` implied as the receiving object * using all of a function's arguments to choose which method should be invoked, rather than just the first, is known as *multiple dispatch* * multiple dispatch is particularly useful for mathematical code, where it makes little sense to artificially deem the operations to "belong" to one argument more than any of the others: does the addition operation in `x + y` belong to `x` any more than it does to `y`? * the implementation of a mathematical operator generally depends on the types of all of its arguments, but even beyond mathematical operations multiple dispatch ends up being a powerful and convenient paradigm for structuring and organizing programs ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ] .right-column[ * functions with a single method and unconstrained argument types behave just like they would in traditional dynamically typed languages * when defining a function, one can optionally constrain the types of parameters it is applicable to, using the `::` type-assertion operator ```julia julia> f(x::Float64, y::Float64) = 2x + y f (generic function with 1 method) ``` * this definition applies only to calls where `x` and `y` are both values of type `Float64` ```julia julia> f(2.0, 3.0) 7.0 ``` * applying it to any other types of arguments will result in a `MethodError` ```julia julia> f(2.0, 3) Error: MethodError: no method matching f(::Float64, ::Int64) Closest candidates are: f(::Float64, !Matched::Float64) at none:1 julia> f(2.0f0, 3.0) Error: MethodError: no method matching f(::Float32, ::Float64) Closest candidates are: f(!Matched::Float64, ::Float64) at none:1 julia> f(2.0, "3.0") Error: MethodError: no method matching f(::Float64, ::String) Closest 
candidates are: f(::Float64, !Matched::Float64) at none:1 julia> f("2.0", "3.0") Error: MethodError: no method matching f(::String, ::String) ``` ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ] .right-column[ * in the previous example, arguments must be precisely of type `Float64`; other numeric types, such as integers or 32-bit floating-point values, are not automatically converted to 64-bit floating-point, nor are strings parsed as numbers * because `Float64` is a concrete type and concrete types cannot be subclassed in Julia, such a definition can only be applied to arguments that are exactly of type `Float64` * often it is useful to write more general methods where the declared parameter types are abstract ```julia julia> f(x::Number, y::Number) = 2x - y f (generic function with 2 methods) julia> f(2.0, 3) 1.0 ``` * this method definition applies to any pair of arguments that are instances of `Number` * they need not be of the same type, so long as they are each numeric values * the problem of handling disparate numeric types is delegated to the arithmetic operations in the expression `2x - y` ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ] .right-column[ * to define a function with multiple methods, one simply defines the function multiple times, with different numbers and types of arguments * the first method definition for a function creates the function object, and subsequent method definitions add new methods to the existing function object * the most specific method definition matching the number and types of the arguments will be executed when the function is applied * in the example above, the two method definitions, taken together, define the behaviour for `f` over all pairs of instances of the abstract type `Number`, but with a different behaviour specific to pairs of `Float64` values * if only one of the arguments is a 64-bit float but the other one is not, then the `f(Float64, Float64)` method cannot be called 
and the more general `f(Number, Number)` method must be used ```julia julia> f(x::Number, y::Number) = 2x - y f (generic function with 2 methods) julia> f(2.0, 3.0) 7.0 julia> f(2.0, 3) 1.0 julia> f(2, 3.0) 1.0 julia> f(2, 3) 1 ``` ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ] .right-column[ * for non-numeric values, and for fewer or more than two arguments, the function `f` remains undefined, and applying it will still result in a `MethodError` ```julia julia> f("foo", 3) Error: MethodError: no method matching f(::String, ::Int64) Closest candidates are: f(!Matched::Number, ::Number) at none:1 julia> f() Error: MethodError: no method matching f() Closest candidates are: f(!Matched::Float64, !Matched::Float64) at none:1 f(!Matched::Number, !Matched::Number) at none:1 ``` * entering the function object itself in an interactive session lists which methods exist for a function ```julia julia> f f (generic function with 2 methods) ``` * the `methods` function shows the signatures of those methods ```julia julia> methods(f) # 2 methods for generic function "f": [1] f(x::Float64, y::Float64) in Main.##WeaveSandBox#268 at none:1 [2] f(x::Number, y::Number) in Main.##WeaveSandBox#268 at none:1 ``` * it also indicates the file and line number where the methods were defined: because these methods were defined at the REPL, we get the apparent line number `none:1` ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ] .right-column[ * in the absence of a type declaration with `::`, the type of a method parameter is `Any` by default, meaning that it is unconstrained since all values in Julia are instances of the abstract type `Any` ```julia julia> f(x,y) = println("Sorry, but we don't have a method for f with parameter types ($(typeof(x)),$(typeof(y)))") f (generic function with 3 methods) julia> f("foo", 1) Sorry, but we don't have a method for f with parameter types (String,Int64) ``` * this catch-all is less specific than any other possible 
method definition for a pair of parameter values, so it will only be called on pairs of arguments to which no other method definition applies * although it seems like a simple concept, multiple dispatch on the types of values is perhaps the single most powerful and central feature of the Julia language * multiple dispatch together with the flexible parametric type system give Julia its ability to abstractly express high-level algorithms decoupled from implementation details, yet generate efficient, specialized code to handle each case at run time ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ### Method Ambiguities ] .right-column[ * it is possible to define a set of function methods such that there is no unique most specific method applicable to some combinations of arguments ```julia julia> g(x::Float64, y) = 2x + y; g (generic function with 1 method) julia> g(x, y::Float64) = x + 2y; g (generic function with 2 methods) julia> g(2.0, 3) 7.0 julia> g(2, 3.0) 8.0 julia> g(2.0, 3.0) Error: MethodError: g(::Float64, ::Float64) is ambiguous. 
Candidates: g(x::Float64, y) in Main.##WeaveSandBox#268 at none:1 g(x, y::Float64) in Main.##WeaveSandBox#268 at none:1 Possible fix, define g(::Float64, ::Float64) ``` * the call `g(2.0, 3.0)` could be handled by either `g(Float64, Any)` or `g(Any, Float64)`; since neither is more specific than the other, Julia raises a `MethodError` * method ambiguities can be avoided by specifying an appropriate method for the intersection case ```julia julia> g(x::Float64, y::Float64) = 2x + 2y; g (generic function with 3 methods) julia> g(2.0, 3.0) 10.0 ``` * it is recommended that the disambiguating method be defined first, since otherwise the ambiguity exists, if transiently, until the more specific method is defined ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ### Method Ambiguities ### Optional & Keyword Arguments ] .right-column[ * optional arguments are implemented as syntax for multiple method definitions ```julia f(a=1,b=2) = a+2b ``` translates to the following three methods ```julia f(a,b) = a+2b f(a) = f(a,2) f() = f(1,2) ``` * calling `f()` is equivalent to calling `f(1,2)`, whose result is `5`, because `f(1,2)` invokes the first method of `f` above * if we define a fourth method that is more specialized, this is no longer the case ```julia f(a::Int, b::Int) = a-2b ``` now the result of both `f()` and `f(1,2)` is `-3` * optional arguments are tied to a function, not to a specific method of that function * it depends on the types of the optional arguments which method is invoked * when optional arguments are defined in terms of a global variable, the type of the optional argument may even change at run-time * keyword arguments behave quite differently from ordinary positional arguments; in particular, they do not participate in method dispatch * methods are dispatched based only on positional arguments, with keyword arguments processed after the matching method is identified ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ### 
Method Ambiguities ### Optional & Keyword Arguments ### Function-like Objects ] .right-column[ * methods are associated with types, so it is possible to make any arbitrary Julia object "callable" by adding methods to its type * such "callable" objects are sometimes called "functors" * example: define a type that stores the coefficients of a polynomial, but behaves like a function evaluating the polynomial ```julia struct Polynomial{R} coeffs::Vector{R} end function (p::Polynomial)(x) v = p.coeffs[end] for i in (length(p.coeffs)-1):-1:1 v = v*x + p.coeffs[i] end return v end (p::Polynomial)() = p(0) ``` ```julia julia> p = Polynomial([1,10,100]) Main.##WeaveSandBox#268.Polynomial{Int64}([1, 10, 100]) julia> p(3) 931 julia> p() 1 ``` * this mechanism is the key to how type constructors and closures (inner functions that refer to their surrounding environment) work in Julia ] --- .left-column[ ## Methods ### Dispatch ### Defining Methods ### Method Ambiguities ### Optional & Keyword Arguments ### Function-like Objects ### Empty Generic Functions ] .right-column[ * occasionally it is useful to introduce a generic function without yet adding methods, for example to separate interface definitions from implementations or for the purpose of documentation or code readability * the syntax for this is an empty function block without a tuple of arguments ```julia function emptyfunc end ``` ``` emptyfunc (generic function with 0 methods) ``` ] --- class: center, middle # Constructors --- .left-column[ ## Constructors ] .right-column[ * constructors are functions that create new objects, specifically, instances of composite types * in Julia, type objects also serve as constructor functions: they create new instances of themselves when applied to an argument tuple as a function ```julia julia> struct BarBaz bar baz end julia> bb = BarBaz(1, 2); Main.##WeaveSandBox#268.BarBaz(1, 2) julia> bb.bar 1 julia> bb.baz 2 ``` * for many types, forming new objects by binding their field 
values together is all that is ever needed to create instances * in some cases more functionality is required when creating composite objects: * sometimes invariants must be enforced, either by checking arguments or by transforming them * recursive data structures often cannot be constructed cleanly without first being created in an incomplete state and then altered programmatically to be made whole, as a separate step from object creation * sometimes, it's just convenient to be able to construct objects with fewer or different types of parameters than they have fields * Julia's system for object construction addresses all of these cases and more ] --- .left-column[ ## Constructors ### Outer Constructor Methods ] .right-column[ * a constructor is just like any other function in Julia in that its overall behaviour is defined by the combined behaviour of its methods * accordingly, you can add functionality to a constructor by simply defining new methods * example: add a constructor method for `BarBaz` objects that takes only one argument and uses the given value for both the `bar` and `baz` fields ```julia julia> BarBaz(x) = BarBaz(x,x); Main.##WeaveSandBox#268.BarBaz julia> BarBaz(1) Main.##WeaveSandBox#268.BarBaz(1, 1) ``` * example: add a zero-argument `BarBaz` constructor method that supplies default values for both of the `bar` and `baz` fields ```julia julia> BarBaz() = BarBaz(0); Main.##WeaveSandBox#268.BarBaz julia> BarBaz() Main.##WeaveSandBox#268.BarBaz(0, 0) ``` * here, the zero-argument constructor method calls the single-argument constructor method, which calls the automatically provided two-argument constructor method * additional constructor methods declared as normal methods like this are called outer constructor methods * outer constructor methods can only ever create a new instance by calling another constructor method, such as the automatically provided default ones ] --- .left-column[ ## Constructors ### Outer Constructor Methods ### Inner 
Constructor Methods ] .right-column[ * while outer constructor methods succeed in addressing the problem of providing additional convenience methods for constructing objects, they fail to enforce invariants, and to allow the construction of self-referential objects * for these problems, in Julia one needs an inner constructor method, which is like an outer constructor method, except for two differences: * it is declared inside the block of a type declaration, rather than outside of it like normal methods * it has access to a special locally existent function called `new` that creates objects of the block's type * example: declare a type that holds a pair of real numbers, subject to the constraint that the first number is not greater than the second one ```julia struct OrderedPair x::Real y::Real OrderedPair(x,y) = x > y ? error("out of order") : new(x,y) end ``` * `OrderedPair` objects can only be constructed such that `x ≤ y` ```julia julia> OrderedPair(1, 2) Main.##WeaveSandBox#268.OrderedPair(1, 2) julia> OrderedPair(2, 1) Error: out of order ``` * if the type were declared mutable, you could reach in and directly change the field values to violate this invariant (obviously that is bad practice) ] --- .left-column[ ## Constructors ### Outer Constructor Methods ### Inner Constructor Methods ] .right-column[ * if any inner constructor method is defined, no default constructor method is provided: it is presumed that all the needed inner constructors are supplied * the default constructor is equivalent to an inner constructor method that takes all of the object's fields as parameters (constrained to be of the correct type, if the corresponding field has a type), and passes them to new, returning the resulting object ```julia struct BarBaz bar baz BarBaz(bar,baz) = new(bar,baz) end ``` * this declaration has the same effect as the earlier definition of the `BarBaz` type without an explicit inner constructor method * it is good practice to provide as few inner 
constructor methods as possible: only those taking all arguments explicitly and enforcing essential error checking and transformation * additional convenience constructor methods, supplying default values or auxiliary transformations, should be provided as outer constructors that call the inner constructors to do the heavy lifting ] --- .left-column[ ## Constructors ### Outer Constructor Methods ### Inner Constructor Methods ### Incomplete Initialization ] .right-column[ * the final problem which has still not been addressed is construction of self-referential objects, or more generally, recursive data structures * to explain the fundamental difficulty, let us consider the following recursive type declaration ```julia mutable struct SelfReferential obj::SelfReferential end ``` * this type may appear innocuous enough, until one considers how to construct an instance of it: if `a` is an instance of `SelfReferential`, then a second instance can be created by the call ```julia b = SelfReferential(a) ``` But how does one construct the first instance when no instance exists to provide as a valid value for its `obj` field? 
* the only solution is to allow creating an incompletely initialized instance of `SelfReferential` with an unassigned `obj` field, and using that incomplete instance as a valid value for the `obj` field of another instance, such as, for example, itself ] --- .left-column[ ## Constructors ### Outer Constructor Methods ### Inner Constructor Methods ### Incomplete Initialization ] .right-column[ * to allow for the creation of incompletely initialized objects, the `new` function may be called with fewer than the number of fields that the type has, returning an object with the unspecified fields uninitialized * the inner constructor method can then use the incomplete object, finishing its initialization before returning it * example: define the SelfReferential type using a zero-argument inner constructor returning instances having `obj` fields pointing to themselves ```julia mutable struct SelfReferential obj::SelfReferential SelfReferential() = (x = new(); x.obj = x) end ``` ```julia julia> x = SelfReferential(); Main.##WeaveSandBox#268.SelfReferential(Main.##WeaveSandBox#268.SelfReferential(#= circular reference @-1 =#)) julia> x === x true julia> x === x.obj true julia> x === x.obj.obj true ``` ] --- .left-column[ ## Constructors ### Outer Constructor Methods ### Inner Constructor Methods ### Incomplete Initialization ] .right-column[ * it is possible to return incompletely initialized objects from an inner constructor (although it is generally a good idea to return a fully initialized object) ```julia mutable struct Incomplete data Incomplete() = new() end ``` ```julia julia> z = Incomplete(); Main.##WeaveSandBox#268.Incomplete(#undef) ``` * while you are allowed to create objects with uninitialized fields, any access to an uninitialized reference is an immediate error ```julia julia> z.data Error: UndefRefError: access to undefined reference ``` * this avoids the need to continually check for `null` values * not all object fields are references: Julia considers 
some types to be "plain data", meaning all of their data is self-contained and does not reference other objects * the plain data types consist of primitive types (e.g. `Int`) and immutable structs or arrays of other plain data types; their initial content is undefined ```julia struct HasPlain n::Int HasPlain() = new() end ``` ```julia julia> HasPlain() Main.##WeaveSandBox#268.HasPlain(5386732336) ``` ] --- class: center, middle # Modules --- .left-column[ ## Modules ] .right-column[ * in Julia, modules are separate variable workspaces, i.e. they introduce a new global scope and allow creating top-level definitions (global variables) without worrying about name conflicts; they are delimited syntactically, inside `module Name ... end` * within a module, importing controls which names from other modules are visible, and exporting specifies which of your names are intended to be public ```julia module MyModule using LinearAlgebra using SharedArrays: SharedMatrix export foo bar(x) = 2x foo(a::SharedMatrix) = det(a) .+ inv(a) end ``` * the statement `using LinearAlgebra` means that the `LinearAlgebra` module will be available for resolving names as needed * when a global variable is encountered that has no definition in the current module, the system will search for it among variables exported by `LinearAlgebra` and import it if it is found there (such as `det` and `inv`) * the statement `using SharedArrays: SharedMatrix` brings just the identifier `SharedMatrix` from module `SharedArrays` into the scope of `MyModule` * the module defines two functions `foo` and `bar`, with `foo` being exported, and thus available for importing into other modules, and `bar` being private ] --- .left-column[ ## Modules ] .right-column[ * Julia also has the `import` keyword, which supports the same syntax as `using`; however, it does not add modules to be searched the way `using` does * once a variable is made visible via `using` or `import`, a module may not create its own variable with 
the same name * imported variables are read-only; assigning to a global variable always affects a variable owned by the current module, or else raises an error ```julia julia> module A1 a = 1 # a global in A1's scope end; Main.##WeaveSandBox#268.A1 julia> module B1 import ..A1 # makes module A1 available A1.a = 2 # changing a variable of an imported module throws the error below end; Error: cannot assign variables in other modules ``` * however, a module might provide functions that change its variables, and these can be imported and called from other modules; such a function must declare the variable `global`, since a plain assignment in a function body would create a new local ```julia julia> module A2 a = 1 # a global in A2's scope set_a(x) = (global a = x) end; Main.##WeaveSandBox#268.A2 julia> module B2 import ..A2 # makes module A2 available A2.set_a(2) # change a variable of an imported module via a set-function end; Main.##WeaveSandBox#268.B2 ``` ] --- .left-column[ ## Modules ### Using Modules ] .right-column[ * to understand the differences between the `using` and `import` keywords, consider the following example module with functions `x` and `y` (exported) and `p` (not exported) ```julia module MyModule export x, y x() = "x" y() = "y" p() = "p" end; ``` ``` Main.##WeaveSandBox#268.MyModule ``` * there are several different ways to load the module and its inner functions into the current workspace

| Import Command                  | What is brought into scope               |
|:------------------------------- |:---------------------------------------- |
| `using MyModule`                | All `export`ed names (`x` and `y`),      |
|                                 | `MyModule.x`, `MyModule.y`, `MyModule.p` |
| `using MyModule: x, p`          | `x` and `p`                              |
| `import MyModule`               | `MyModule.x`, `MyModule.y`, `MyModule.p` |
| `import MyModule.x, MyModule.p` | `x` and `p`                              |
| `import MyModule: x, p`         | `x` and `p`                              |

* functions brought into scope with `using` cannot be extended with new methods under their unqualified name; methods can be added to functions imported with `import`, or by using the qualified name (e.g. `MyModule.x`) ] --- .left-column[ ## Modules ### Using Modules ### Modules and Files ] .right-column[ * files and file names are mostly unrelated to modules; 
modules are associated only with module expressions * one can have multiple files per module, and multiple modules per file ```julia module Foo include("file1.jl") include("file2.jl") end ``` * including the same code in different modules provides mixin-like behaviour; one could use this to run the same code with different base definitions, for example testing code by running it with "safe" versions of some operators ```julia module Normal include("mycode.jl") end module Testing include("safe_operators.jl") include("mycode.jl") end ``` ] --- .left-column[ ## Modules ### Using Modules ### Modules and Files ### Paths ] .right-column[ * given the statement `using Foo`, the system consults an internal table of top-level modules to look for one named `Foo` * if the module does not exist, the system attempts to `require(:Foo)`, which typically results in loading code from an installed package * some modules contain submodules, which means you sometimes need to access a non-top-level module * the first way to do this is to use an absolute path, for example `using Base.Sort` * the second way is to use a relative path, which makes it easier to import submodules of the current module or any of its enclosing modules ```julia module Parent module Utils # ... end using .Utils # ... 
end ``` * the module `Parent` contains a submodule `Utils`, and code in `Parent` wants the contents of `Utils` to be visible; this is done by starting the `using` path with a period * adding more leading periods moves up additional levels in the module hierarchy; for example `using ..Utils` would look for `Utils` in `Parent`'s enclosing module * note that relative-import qualifiers are only valid in `using` and `import` statements ] --- class: center, middle # Scope of Variables --- .left-column[ ## Scope of Variables ] .right-column[ * certain constructs in the language introduce scope blocks, which are regions of code that are eligible to be the scope of some set of variables * the scope of a variable cannot be an arbitrary set of source lines; instead, it will always line up with one of these blocks * in Julia, there are two main types of scopes, *global scope* and *local scope*, where the latter can be nested * since Julia v1.5 there is a distinction between constructs which introduce a "hard scope" and those which only introduce a "soft scope", which affects whether shadowing a global variable by the same name is allowed or not * scope constructs:

| Construct                                    | Scope type   | Allowed within  |
|:-------------------------------------------- |:------------ |:--------------- |
| REPL                                         | global       | global          |
| `module`                                     | global       | global          |
| `struct`                                     | local (soft) | global          |
| `macro`                                      | local (hard) | global          |
| `for`, `while`, `try`                        | local (soft) | global or local |
| `let`, functions, comprehensions, generators | local (hard) | global or local |

* notably missing from this table are `begin` blocks and `if` blocks, which do not introduce new scopes ] --- .left-column[ ## Scope of Variables ] .right-column[ * Julia uses *lexical scoping*, meaning that a function's scope does not inherit from its caller's scope, but from the scope in which the function was defined. 
```julia module Bar x = 1 foo() = x end; ``` ``` Main.##WeaveSandBox#268.Bar ``` ```julia julia> import .Bar julia> x = -1; -1 julia> Bar.foo() 1 ``` * thus lexical scope means that what a variable in a particular piece of code refers to can be deduced from the code in which it appears alone and does not depend on how the program executes * a scope nested inside another scope can "see" variables in all the outer scopes in which it is contained * outer scopes, on the other hand, cannot see variables in inner scopes * in a scope, each variable can only have one meaning, which is determined regardless of the order of expressions ] --- .left-column[ ## Scope of Variables ### Global Scope ] .right-column[ * Julia does not have an all-encompassing global scope; instead, each module introduces a new global scope, separate from the global scope of all other modules * the interactive prompt (REPL) is in the global scope of the module `Main` * modules can introduce variables of other modules into their scope through the `using` or `import` statements or through qualified access using the dot-notation * a module is a namespace as well as a data structure associating names with values * while variable bindings can be read externally, they can only be changed within the module to which they belong ```julia julia> module A a = 1 # a global in A's scope end; Main.##WeaveSandBox#268.A julia> module B module C c = 2 end b = C.c # read a variable of a nested module through a qualified access import ..A # makes module A available d = A.a # reading a variable from an imported module is ok A.a = 2 # changing a variable of an imported module throws the error below end; Error: cannot assign variables in other modules julia> module D b = a # errors as D's global scope is separate from A's end; Error: UndefVarError: a not defined ``` ] --- .left-column[ ## Scope of Variables ### Global Scope ### Local Scope ] .right-column[ * a new local scope 
is introduced by most code blocks (see list above) * in any local scope, writing `local x` declares a new local variable in that scope, regardless of whether there is already a variable named `x` in an outer scope or not * while some programming languages require explicitly declaring new variables before using them, Julia, like many other dynamic languages, considers assignment to a new variable in a local scope to implicitly declare that variable as a new local * specifically, when `x = <value>`
occurs in a local scope, Julia applies the following rules to decide what the expression means based on where the assignment expression occurs and what `x` already refers to at that location: 1. **Existing local:** If `x` is *already a local variable*, then the existing local `x` is assigned; 2. **Hard scope:** If `x` is *not already a local variable* and assignment occurs inside of any hard scope construct (`let` block, function or macro body, comprehension, or generator), a new local named `x` is created in the scope of the assignment; 3. **Soft scope:** If `x` is *not already a local variable* and all of the scope constructs containing the assignment are soft scopes (loops, `try`/`catch` blocks, or `struct` blocks), the behaviour depends on whether the global variable `x` is defined: * if global `x` is *undefined*, a new local named `x` is created in the scope of the assignment; * if global `x` is *defined*, the assignment is considered ambiguous: * in *non-interactive* contexts (files, eval), an ambiguity warning is printed and a new local is created; * in *interactive* contexts (REPL, notebooks), the global variable `x` is assigned. 
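* a minimal sketch of rules 1 and 2, not taken from the original slides: the function names `hard_scope_demo` and `global_demo` are hypothetical, and rule 3 is left out because its outcome depends on whether the code runs interactively

```julia
x = 1  # a global `x` in the enclosing module

# hypothetical helper illustrating rules 1 and 2
function hard_scope_demo()
    x = 2          # rule 2: a function body is a hard scope, so this `x` is a new local
    for i in 1:3
        x += i     # rule 1: `x` is already a local of the function, so it is updated
    end
    return x
end

# hypothetical helper: an explicit `global` declaration opts out of rule 2
function global_demo()
    global x
    x = 10         # assigns the global `x`
    return x
end

println(hard_scope_demo())  # prints 8; the global `x` is still 1
println(global_demo())      # prints 10; the global `x` is now 10
```

* the loop inside `hard_scope_demo` is a soft scope nested in a hard scope, but rule 3 never comes into play there because `x` is already local when the loop body assigns to it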
] --- .left-column[ ## Scope of Variables ### Global Scope ### Local Scope ] .right-column[ * example: assignment inside of a hard scope, namely a function body, when no local variable by that name already exists ```julia module TestModule1 function greet() x = "hello" # new local println(x) end greet() println(x) end; ``` ``` hello Error: UndefVarError: x not defined ``` * inside of the `greet` function, the assignment `x = "hello"` causes `x` to be a new local variable in the function's scope * the assignment occurs in local scope and there is no existing local `x` variable * since `x` is local, it doesn't matter if there is a global named `x` or not ] --- .left-column[ ## Scope of Variables ### Global Scope ### Local Scope ] .right-column[ * example: define `x = 123` before defining and calling `greet` ```julia module TestModule2 x = 123 # global function greet() x = "hello" # new local println(x) end greet() println(x) end; ``` ``` hello 123 Main.##WeaveSandBox#268.TestModule2 ``` * since the `x` in `greet` is local, the value (or lack thereof) of the global `x` is unaffected by calling `greet` * the hard scope rule doesn't care whether a global named `x` exists or not: assignment to `x` in a hard scope is local (unless `x` is declared global) ] --- .left-column[ ## Scope of Variables ### Global Scope ### Local Scope ] .right-column[ * example: there is already a local variable named `x`, in which case `x = <value>`
always assigns to this existing local `x` * the function `sum_to` computes the sum of the numbers from one up to `n` ```julia module TestModule3 function sum_to(n) s = 0 # new local for i in 1:n s = s + i # assign existing local end return s # same local end println(sum_to(10)) println(s) end; ``` ``` 55 Error: UndefVarError: s not defined ``` * as in the previous example, the first assignment to `s` at the top of `sum_to` causes `s` to be a new local variable in the body of the function * the `for` loop has its own inner local scope within the function scope * at the point where `s = s + i` occurs, `s` is already a local variable, so the assignment updates the existing `s` instead of creating a new local * since `s` is local to the function `sum_to`, calling the function has no effect on the global variable `s` (should it exist) ] --- .left-column[ ## Scope of Variables ### Global Scope ### Local Scope ] .right-column[ * modify the previous example to save the sum `s + i` in a variable `t` before updating `s` ```julia module TestModule4 function sum_to(n) s = 0 # new local for i in 1:n t = s + i # new local `t` s = t # assign existing local `s` end return s, @isdefined(t) end println(sum_to(10)) end; ``` ``` (55, false) Main.##WeaveSandBox#268.TestModule4 ``` * this version returns `s` as before but it also uses the `@isdefined` macro to return a boolean indicating whether there is a local variable named `t` defined in the function's outermost local scope * because of the hard scope rule, there is no `t` defined outside of the `for` loop body * since the assignment to `t` occurs inside of a function, which introduces a hard scope, the assignment causes `t` to become a new local variable in the local scope where it appears, i.e. 
inside of the loop body
* even if there were a global named `t`, it would make no difference: the hard scope rule is not affected by anything in global scope

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
]

.right-column[

* example: move the body of `greet` into a `for` loop, whose scope is soft rather than hard

```julia
module TestModule5
    for i in 1:3
        x = "hello" # new local
        println(x)
    end
    println(x)
end;
```

```
hello
hello
hello
Error: UndefVarError: x not defined
```

* since the global `x` is not defined when the `for` loop is evaluated, the first clause of the soft scope rule applies and `x` is created as local to the `for` loop; the global `x` therefore remains undefined after the loop executes

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
]

.right-column[

* example: move the body of `sum_to` into global scope, fixing its argument to `n = 10`

```julia
module TestModule6
    s = 0 # new global
    for i in 1:10
        t = s + i # new local `t`
        s = t # assign global `s`
    end
    println((s, @isdefined(t)))
end;
```

```
Error: UndefVarError: s not defined
```

* second try: explicitly declare `s` as a global variable inside the `for` loop

```julia
module TestModule7
    s = 0 # new global
    for i in 1:10
        global s
        t = s + i # new local `t`
        s = t # assign global `s`
    end
    println((s, @isdefined(t)))
end;
```

```
(55, false)
```

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
]

.right-column[

* remember: in a scope, each variable can only have one meaning, and that meaning is determined regardless of the order of expressions
* in the first try of the previous example, the presence of the expression `s = t` in the loop causes `s` to be local to the loop, which means that it is also local when it appears on the RHS of `t = s + i`, even though that expression appears first and is evaluated first

```julia
module TestModule8
    s = 0 # new global
    for i in 1:10
        t = s + i # new local `t`
        s = t # assign global `s`
    end
end;
```

```
Error: UndefVarError: s not defined
```

* one might imagine that the `s` on the first line of the loop could be global while the `s` on the second line of the loop is local, but that is not possible since the two lines are in the same scope block and each variable can only mean one thing in a given scope
* if the assignment `s = t` is removed, `s` is unambiguously interpreted as the global `s`

```julia
module TestModule9
    s = 0 # new global
    for i in 1:10
        t = s + i # new local `t`
    end
end;
```

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
]

.right-column[

* in Julia v1.5, the REPL uses soft scope for top-level expressions, so that an assignment inside a scope block such as a `for` loop automatically assigns to a global variable if one has been defined already (same behaviour as in a function body)

```julia
julia> s = 0; # global

julia> for i in 1:10
           t = s + i # new local `t`
           s = t # assign global `s`
       end

julia> s # global
55
```

* if this code appears in a top-level scope in a file, Julia v1.5 prints an ambiguity warning and throws an undefined variable error

```julia
julia> code = """
       s = 0 # global
       for i in 1:10
           t = s + i
           s = t # ambiguous assignment: global or local?
       end
       s # global
       """;

julia> include_string(Main, code)
┌ Warning: Assignment to `s` in soft scope is ambiguous because a global variable by the same name exists: `s` will be treated as a new local. Disambiguate by using `local s` to suppress this warning or `global s` to assign to the existing global variable.
└ @ string:4
ERROR: LoadError: UndefVarError: s not defined
```

here `include_string` is used to evaluate code as though it were the contents of a file

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
]

.right-column[

* a few words should be said about why the ambiguous soft scope case is handled differently in interactive and non-interactive contexts
* Why doesn't it just work like the REPL everywhere?
* Why doesn't it just work like in files everywhere?
* in Julia ≤ v0.6, all global scopes did work like the current REPL: when `x = <value>
` occurred in a loop or `try`/`catch` block, but outside of a function body, a `let` block or a comprehension, whether `x` should be local to the loop was decided based on whether a global named `x` was defined or not
* this behaviour is intuitive and convenient since it approximates the behaviour inside of a function body as closely as possible and makes it easy to move code back and forth between a function body and the REPL when trying to debug the behaviour of a function
* however, it is bad for programming "at scale" and allows for "spooky action at a distance": when someone else adds a new global far away, possibly in a different file, the code suddenly changes meaning and either breaks noisily or, worse still, silently does the wrong thing, which is something that good programming language designs should prevent

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
]

.right-column[

* the meaning of a small piece of code like the following is quite obvious

```julia
s = 0
for i in 1:10
    s += i
end
```

* the intention is to modify the existing global variable `s`
* however, not all real world code is so short or so clear

```julia
x = 123

# much later, maybe in a different file

for i in 1:10
    x = "hello"
    println(x)
end

# much later, maybe in yet another file
# or maybe back in the first one where `x = 123`

y = x + 234
```

* it is not quite obvious what should happen here, but it seems probable that the intention is for `x` to be local to the `for` loop
* with the Julia ≤ v0.6 behaviour, it is especially concerning that someone might have written the `for` loop first, had it working just fine, but later when someone else adds a new global far away the code changes meaning

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
]

.right-column[

* in Julia v1.0 the rules for scope were simplified: in any local scope, assignment to a name that wasn't already a local variable created a new local
variable
* this eliminated the notion of soft scope entirely as well as removing the potential for spooky action and consequently uncovered a significant number of bugs, vindicating the choice to get rid of it
* the first example needs to be rewritten accordingly as

```julia
s = 0
for i in 1:10
    global s += i
end
```

* there are two main issues with requiring `global` for this kind of top-level code:
  * it is no longer convenient to copy and paste the code from inside a function body into the REPL to debug it: you have to add `global` annotations and then remove them again to go back
  * beginners will write this kind of code without the `global` and have no idea why their code does not work: the error that they get is that `s` is undefined, which does not seem to enlighten anyone who happens to make this mistake

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
]

.right-column[

* as of Julia v1.5, this code works without the `global` annotation in interactive contexts like the REPL or Jupyter notebooks (just like Julia v0.6), whereas in files and other non-interactive contexts it prints this very direct warning

```plain
Assignment to s in soft scope is ambiguous because a global variable by the same name exists: s will be treated as a new local. Disambiguate by using local s to suppress this warning or global s to assign to the existing global variable.
```

* this addresses both issues while preserving the "programming at scale" benefits of the Julia v1.0 behaviour
* in the REPL copy-and-paste debugging works and beginners don't have any issues
* any time someone either forgets a `global` annotation or accidentally shadows an existing `global` with a `local` in a soft scope, which would be confusing anyway, they get a nice clear warning
* global variables have no spooky effect on the meaning of code that may be far away
* an important property of this design is that any code that executes in a file without a warning will behave the same way in a fresh REPL
* on the flip side, if you save a REPL session to a file and it behaves differently than it did in the REPL, then you will get a warning

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
### Let Blocks
]

.right-column[

* unlike assignments to local variables, `let` statements allocate new variable bindings
* an assignment modifies an existing value location, and `let` creates new locations
* the `let` syntax accepts a comma-separated series of assignments and variable names

```julia
julia> x, y, z = -1, -1, -1;
(-1, -1, -1)

julia> let x = 1, z
           println("x: $x, y: $y") # x is local variable, y the global
           println("z: $z") # errors as z has not been assigned yet but is local
       end
x: 1, y: -1
Error: UndefVarError: z not defined
```

* the assignments are evaluated in order, with each right-hand side evaluated in the scope before the new variable on the left-hand side has been introduced
* therefore it makes sense to write something like `let x = x` since the two `x` variables are distinct and have separate storage
* since the `begin` construct does not introduce a new scope, one can use a zero-argument `let` to introduce a new scope block without creating any new bindings

```julia
julia> let
           local x = 1
           let
               local x = 2
           end
           x
       end
1
```

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
### Let Blocks
]

.right-column[

* example: create and store two closures that return the variable `i`

```julia
julia> Fs = Vector{Any}(undef, 2); i = 1;
1

julia> while i <= 2
           Fs[i] = ()->i
           global i += 1
       end

julia> Fs[1]()
3

julia> Fs[2]()
3
```

* the two closures behave identically because both capture the same variable `i`, whose value is `3` after the loop
* we can use `let` to create a new binding for `i`

```julia
julia> Fs = Vector{Any}(undef, 2); i = 1;
1

julia> while i <= 2
           let i = i
               Fs[i] = ()->i
           end
           global i += 1
       end

julia> Fs[1]()
1

julia> Fs[2]()
2
```

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
### Let Blocks
### Loops
]

.right-column[

* in loops and comprehensions, new variables introduced in their body scopes are freshly allocated for each loop iteration, as if the loop body were surrounded by a `let` block

```julia
julia> Fs = Vector{Any}(undef, 2);
2-element Array{Any,1}:
 #undef
 #undef

julia> for j in 1:2
           Fs[j] = ()->j
       end

julia> Fs[1]()
1

julia> Fs[2]()
2
```

* a `for` loop or comprehension iteration variable is always a new variable

```julia
julia> function f()
           i = 0
           for i in 1:3
               # empty
           end
           return i
       end;
f (generic function with 4 methods)

julia> f()
0
```

* reusing an existing local variable as the iteration variable can be done by adding the `outer` keyword

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
### Let Blocks
### Loops
### Constants
]

.right-column[

* the keyword `const` is used to declare global variables whose values will not change

```julia
julia> const α = 10
10
```

* multiple variables can be declared within a single `const`

```julia
julia> const β, γ = 7, 11
(7, 11)
```

* `const` only applies to one `=` operation, therefore

```julia
julia> const μ = ν = 1
1
```

declares `μ` to be constant but not `ν`; on the other hand

```julia
julia> const ϕ = const ψ = 1
1
```

declares both `ϕ` and `ψ` constant

* `const` only affects the variable binding but "constant-ness" does not extend into
mutable containers (such as an array), which may still be modified
* special top-level assignments, such as those performed by the `function` and `struct` keywords, are constant by default
* the `const` declaration should only be used in global scope on global variables; local constant declarations are currently not supported

]

---

.left-column[
## Scope of Variables
### Global Scope
### Local Scope
### Soft Scope
### Let Blocks
### Loops
### Constants
]

.right-column[

* when one tries to assign a value to a variable that is declared constant, the following scenarios are possible
* if the new value has a different type than the constant, an error is thrown

```julia
julia> const c1 = 1.0
1.0

julia> c1 = 1
Error: invalid redefinition of constant c1
```

* if the new value has the same type as the constant, a warning is printed

```julia
julia> const c2 = 1.0
1.0

julia> c2 = 2.0
WARNING: redefining constant c2
2.0
```

* if the assignment would not change the value of the variable, no message is given

```julia
julia> const c3 = 100
100

julia> c3 = 100
100
```

* note that although sometimes possible, changing the value of a `const` variable is strongly discouraged, and is intended only for convenience during interactive use

]

---

class: center, middle

# Package Management

---

.left-column[
## Packages
]

.right-column[

* Julia's built-in package manager Pkg handles operations such as installing, updating and removing packages
* Pkg comes with a REPL, which can be entered by pressing `«closing square bracket»` from the Julia REPL; to get back to the Julia REPL, press backspace or ^C
* upon entering the Pkg REPL, you should see a prompt similar to

```julia
(@v1.4) pkg>
```

* if you are ever stuck, you can ask Pkg for help

```julia
(@v1.4) pkg> ?
```

displaying a list of available commands along with short descriptions

* you can ask for more detailed help by specifying a command

```julia
(@v1.4) pkg> ?develop
```

]

---

.left-column[
## Packages
### REPL
]

.right-column[

* to add a package, use `add`

```julia
(@v1.4) pkg> add Example
```

* one can also specify multiple packages at once

```julia
(@v1.4) pkg> add StaticArrays OffsetArrays
```

* these are all registered packages; to work with unregistered packages specify a URL

```julia
(@v1.4) pkg> add https://github.com/DDMGNI/VortexCollisions.jl
```

or a local path

```julia
(@v1.4) pkg> add ../VortexCollisions.jl
```

* a specific version can be installed by appending a version after a `@` symbol

```julia
(@v1.4) pkg> add Example@0.4
```

* if a branch (or a certain commit) of `Example` has a hotfix that is not yet included in a registered version, we can explicitly track that branch (or commit) by appending `#branchname` (or `#commitSHA1`) to the package name

```julia
(@v1.4) pkg> add Example#master
```

* to go back to tracking the registry version of `Example`, the command `free` is used

```julia
(@v1.4) pkg> free Example
```

]

---

.left-column[
## Packages
### REPL
]

.right-column[

* to remove packages, use `rm`

```julia
(@v1.4) pkg> rm StaticArrays OffsetArrays
```

* unregistered packages can also be removed by name

```julia
(@v1.4) pkg> rm VortexCollisions
```

* use `update` to update an installed package

```julia
(@v1.4) pkg> update Example
```

* to update all installed packages, use `update` without any arguments

```julia
(@v1.4) pkg> update
```

* use `status` to list the packages you have added yourself

```julia
(@v1.4) pkg> status
```

* some packages are registered in private registries; to make such packages available, Pkg must be informed about the registry with `registry add`

```julia
(@v1.4) pkg> registry add https://github.com/FramefunVC/FrameFunRegistry
```

a registry is removed again with `registry rm`

```julia
(@v1.4) pkg> registry rm FrameFunRegistry
```

]

---
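.left-column[
## Packages
### REPL
]

.right-column[

* the REPL commands of the previous slides also have a functional form in the `Pkg` standard library, which can be convenient in scripts; the following is only a minimal sketch, in which the helper name `demo_pkg_commands` is made up for illustration and is deliberately never called, since its body needs network access and modifies the active environment

```julia
import Pkg

# Package specifications mirror the REPL syntax; building them is purely local.
pinned  = Pkg.PackageSpec(name = "Example", version = "0.4") # pkg> add Example@0.4
tracked = Pkg.PackageSpec(name = "Example", rev = "master")  # pkg> add Example#master

# Hypothetical helper (not part of Pkg): it is defined here but never called,
# because every call in its body needs network access and changes the
# active environment.
function demo_pkg_commands()
    Pkg.add("Example")    # pkg> add Example
    Pkg.add(pinned)       # pkg> add Example@0.4
    Pkg.add(tracked)      # pkg> add Example#master
    Pkg.free("Example")   # pkg> free Example
    Pkg.update("Example") # pkg> update Example
    Pkg.status()          # pkg> status
    Pkg.rm("Example")     # pkg> rm Example
end
```

* calling `demo_pkg_commands()` in a scratch environment should roughly mirror the REPL session shown before

]

---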
.left-column[
## Packages
### REPL
### Environments
]

.right-column[

* Pkg offers significant advantages over traditional package managers by organizing dependencies into environments
* the `(@v1.4)` in the REPL prompt lets us know that `@v1.4` is the active environment that will be modified by Pkg commands such as `add`, `rm` and `update`
* the active environment is set using `activate`; if the environment does not exist, a new environment is set up

```julia
(@v1.4) pkg> activate tutorial
 Activating new environment at `/tmp/tutorial/Project.toml`.
```

* Pkg lets us know that we are creating a new environment whose project file is `/tmp/tutorial/Project.toml`
* a project file is where Pkg stores metadata for an environment
* Pkg has also updated the REPL prompt in order to reflect the new active environment

```julia
(tutorial) pkg>
```

* we can ask for information about the active environment by using `status`

```julia
(tutorial) pkg> status
Status `/tmp/tutorial/Project.toml`
  (empty environment)
```

]

---

.left-column[
## Packages
### REPL
### Environments
]

.right-column[

* the new environment is empty, so let us add a package

```julia
(tutorial) pkg> add Example
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General`
  Resolving package versions...
  Installed Example ─ v0.5.3
   Updating `~/DataShare/Talks/2020/Julia Patterns/tutorial/Project.toml`
  [7876af07] + Example v0.5.3
   Updating `~/DataShare/Talks/2020/Julia Patterns/tutorial/Manifest.toml`
  [7876af07] + Example v0.5.3
```

* Pkg updates the General registry, installs the latest version of the `Example` package and updates the `Project.toml` and `Manifest.toml` files accordingly
* the `Manifest.toml` file stores versions and sources of all packages installed in the environment, including manually added packages as well as automatically installed dependencies
* given the `Project.toml` and `Manifest.toml` files, a copy of an environment in the exact same state can be instantiated anywhere using the `instantiate` command

```julia
(@v1.4) pkg> activate tutorial

(tutorial) pkg> instantiate
```

* running `activate` with no arguments returns to the default environment

```julia
(tutorial) pkg> activate
 Activating environment at `~/.julia/environments/v1.4/Project.toml`

(@v1.4) pkg>
```

]

---

class: center, middle

# Performance Tips

---

.left-column[
## Performance Tips
]

.right-column[

* a useful tool for measuring performance is the `@time` macro, which reports runtimes and memory allocations during a function call

```julia
julia> x = rand(10000);

julia> function mysum()
           s = zero(eltype(x))
           for i in x
               s += i
           end
           return s
       end;
mysum (generic function with 1 method)

julia> @time mysum();
  0.000924 seconds (39.49 k allocations: 773.281 KiB)
5013.43035736099

julia> @time mysum();
  0.001050 seconds (39.49 k allocations: 773.281 KiB)
5013.43035736099
```

* on the first call the
function gets compiled (this is sometimes referred to as warmup), so you should not take the results of this run seriously
* on the first call of `@time`, it will also compile functions needed for timing
* in the second run, it will report the actual run time and memory allocations
* unexpected memory allocation is almost always a sign of some problem with your code, usually a problem with type-stability or creating many small temporary arrays; consequently, in addition to the allocation itself, it is very likely that the code generated for your function is far from optimal
* take such indications seriously and follow the advice below

]

---

.left-column[
## Performance Tips
]

.right-column[

* the *BenchmarkTools.jl* package makes performance tracking of Julia code even easier
* it provides the `@btime` macro, which is similar to `@time` but takes care of warmup

```julia
julia> using BenchmarkTools

julia> @btime mysum();
```

* the `@benchmark` macro provides more comprehensive output

```julia
julia> @benchmark mysum()
```

* the `@benchmark` macro usually runs several samples of the function to evaluate in order to mitigate benchmark noise and obtain reasonable and consistent performance estimates

]

---

.left-column[
## Performance Tips
### Global Variables
]

.right-column[

* a global variable might have its value, and therefore its type, change at any point
* this makes it difficult for the compiler to optimize code using global variables
* variables should be local, or passed as arguments to functions, whenever possible
* code that is performance critical or being benchmarked should be inside a function
* globals are often constants; declaring them as such
greatly improves performance

```julia
julia> const DEFAULT_VAL = 0;
0
```

* uses of non-constant globals can be optimized by annotating their types when used

```julia
global x = rand(1000);

function loop_over_global_annotated()
    s = 0.0
    for i in x::Vector{Float64}
        s += i
    end
    return s
end;
```

```
loop_over_global_annotated (generic function with 1 method)
```

* passing arguments to functions is better style and leads to more reusable code

```julia
function loop_over_global(x)
    s = 0.0
    for i in x
        s += i
    end
    return s
end;
```

```
loop_over_global (generic function with 1 method)
```

]

---

.left-column[
## Performance Tips
### Global Variables
]

.right-column[

```julia
function loop_over_global_simple()
    s = 0.0
    for i in x
        s += i
    end
    return s
end;

function loop_over_global_annotated()
    s = 0.0
    for i in x::Vector{Float64}
        s += i
    end
    return s
end;

function loop_over_global(x)
    s = 0.0
    for i in x
        s += i
    end
    return s
end;
```

```
loop_over_global (generic function with 1 method)
```

```julia
julia> @btime loop_over_global_simple();

julia> @btime loop_over_global_annotated();

julia> @btime loop_over_global($x); # `$` interpolates the global into the benchmark
```

]

---

.left-column[
## Performance Tips
### Global Variables
### Type Stability
#### Type-stable Functions
]

.right-column[

#### Write "type-stable" functions

* the following function might return a value of two types: either the type of `0`, which is an integer (of type `Int`), or the type of `x`, which might be of any type

```julia
pos(x) = x < 0 ? 0 : x;
```

```
pos (generic function with 1 method)
```

* it is easy to ensure that the function always returns a value of the same type

```julia
pos(x) = x < 0 ?
         zero(x) : x;
```

```
pos (generic function with 1 method)
```

* there is also a `one()` function, and a more general `oftype(x, y)` function, which returns `y` converted to the type of `x`

]

---

.left-column[
## Performance Tips
### Global Variables
### Type Stability
#### Type-stable Functions
#### Type-stable Variables
]

.right-column[

#### Avoid changing the type of a variable

* an analogous "type-stability" problem exists for variables used repeatedly within a function

```julia
julia> function foo()
           x = 1
           for i in 1:10
               x /= rand()
           end
           return x
       end;
foo (generic function with 2 methods)
```

* the local variable `x` starts as an integer, and after one loop iteration becomes a floating-point number (the result of the `/` operator)
* possible solutions:
  * initialize `x` with `x = 1.0`
  * declare the type of `x`: `x::Float64 = 1`
  * use an explicit conversion: `x = one(Float64)`

]

---

.left-column[
## Performance Tips
### Global Variables
### Type Stability
#### Type-stable Functions
#### Type-stable Variables
#### Type Annotations
]

.right-column[

#### Annotate values taken from untyped locations

* it is often convenient to work with data structures that may contain values of any type (e.g. arrays of type `Array{Any}`)
* if you are using such a structure and happen to know the type of an element, it helps to share this knowledge with the compiler

```julia
function foo(a::Array{Any,1})
    x = a[1]::Int32
    b = x+1
    # ...
end
```

* here, we happened to know that the first element of `a` would be an `Int32`
* making an annotation like this has the added benefit that it will raise a run-time error if the value is not of the expected type, potentially catching certain bugs
* if the type of `a[1]` is not known precisely, `x` can be declared via

```julia
x = convert(Int32, a[1])::Int32
```

* the use of the `convert` function allows `a[1]` to be any object convertible to an `Int32`, thus increasing the genericity of the code by loosening the type requirement
* notice that `convert` itself needs a type annotation in this context in order to achieve type stability, because the compiler cannot deduce the type of the return value of a function, even `convert`, unless the types of all the function's arguments are known

]

---

.left-column[
## Performance Tips
### Global Variables
### Type Stability
#### Type-stable Functions
#### Type-stable Variables
#### Type Annotations
#### Function Barriers
]

.right-column[

* many functions follow a pattern of performing some set-up work, and then running many iterations to perform a core computation
* these core computations should be put in separate functions
* the following contrived function returns an array of a randomly-chosen type

```julia
function strange_twos(n)
    a = Vector{rand(Bool) ? Int64 : Float64}(undef, n)
    for i in 1:n
        a[i] = 2
    end
    return a
end;
```

* Julia's compiler specializes code for argument types at function boundaries, so it does not know the type of `a` during the loop (since it is chosen randomly)
* separating out the inner loop allows the compiler to specialise for different types

```julia
function fill_twos!(a)
    for i in eachindex(a)
        a[i] = 2
    end
end;

function strange_twos(n)
    a = Vector{rand(Bool) ?
               Int64 : Float64}(undef, n)
    fill_twos!(a)
    return a
end;
```

* the second form is also often better style and can lead to more code reuse

]

---

class: center, middle

# Style Guide

---

.left-column[
## Style Guide
]

.right-column[

* variable names are in lower case, word separation can be indicated by underscores (`_`), but use of underscores is discouraged unless the name would be hard to read otherwise
* names of types and modules begin with a capital letter and word separation is shown with upper camel case instead of underscores
* names of functions and macros are in lower case, without underscores
* functions that write to their arguments ("mutating" or "in-place" functions) have names that end in `!`
* conciseness is valued, but avoid abbreviation (`indexin` rather than `indxin`) as it becomes difficult to remember whether and how particular words are abbreviated
* don't overuse try-catch: it is better to avoid errors than to rely on catching them
* don't parenthesize conditions: Julia does not require parens around conditions in `if` and `while`
* avoid using floats for numeric literals in generic code when possible: `f(x) = 2 * x` is less disruptive than `f(x) = 2.0 * x`
* further reading:
  * [Julia Manual: Style Guide](https://docs.julialang.org/en/v1.4-dev/manual/style-guide/)
  * [Blue: a Style Guide for Julia](https://github.com/invenia/BlueStyle)
  * [YASGuide: Yet Another Style Guide For Julia](https://github.com/jrevels/YASGuide)

]
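---

.left-column[
## Style Guide
]

.right-column[

* the naming and literal conventions can be illustrated with a small sketch; all names in it (`ParticleState`, `shift!`, `double`, `double_float`) are made up for this example and do not come from any package

```julia
# Type names use upper camel case (hypothetical type for illustration).
struct ParticleState
    position::Vector{Float64}
end

# Function names are lower case; the trailing `!` signals that
# the first argument is modified in place.
function shift!(state::ParticleState, offset)
    state.position .+= offset
    return state
end

# An integer literal keeps generic code generic: the argument type is preserved.
double(x) = 2 * x

# A float literal promotes integer and rational arguments to floating point.
double_float(x) = 2.0 * x

state = ParticleState([0.0, 1.0])
shift!(state, 0.5)

println(state.position)      # prints [0.5, 1.5]
println(double(1//2))        # prints 1//1
println(double_float(1//2))  # prints 1.0
```

* with the integer literal, `double` returns an `Int` for `3` and a `Rational` for `1//2`, whereas `double_float` promotes both to `Float64`

]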