This short Scala crash course is based on the excellent Scala School by Twitter.
We chose Scala as the language for the course (as well as the language we develop our systems in) for several reasons, some of which are:
To install Scala, please follow these instructions: www.scala-lang.org/download/install.html
Starting the interpreter: run scala from the command line.
Programming in IntelliJ: www.jetbrains.com/idea
There is also a Scala IDE based on Eclipse, but we recommend IntelliJ!
Get sbt here: http://www.scala-sbt.org/
You need to do this if you want to make use of the interactive part of the tutorial and lecture!
git clone https://github.com/uclmr/stat-nlp-book.git; cd stat-nlp-book
git submodule update --init --recursive
sbt compile
cd wolfe; sbt compile; sbt publish-local; cd ..
cp moro/conf/application-statnlpbook.conf moro/conf/application.conf
cd moro; git checkout master; sbt run
Then open localhost:9000 in your browser.
This part of the book is a quick crash course covering the basics of Scala that you will need to understand the book and code up your assignments. All sorts of feedback are welcome and highly appreciated!
You can run these commands either in IntelliJ, or by running sbt console or scala on your command line, which starts Scala's REPL (Read-Evaluate-Print Loop) interpreter.
Almost everything in Scala is an expression, for example:
Numerical Calculations
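As a small sketch of numerical expressions (the value names are illustrative and not from the original tutorial), each line below is an expression whose result the REPL would print:

```scala
val sum = 1 + 2              // Int: 3
val product = 3.5 * 2        // Double: 7.0
val remainder = 17 % 5       // Int: 2
val power = math.pow(2, 10)  // Double: 1024.0
```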
However, be aware that Scala's automatic type inference may not always do what you want:
In this case, given two Int operands, Scala uses integer division, which is wrong if you wanted a decimal value.
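A minimal illustration of the pitfall, with made-up value names: making either operand a Double is enough to get floating-point division.

```scala
val truncated = 7 / 2           // Int division: 3
val exact = 7 / 2.0             // one Double operand: 3.5
val converted = 7.toDouble / 2  // explicit conversion: 3.5
```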
String Operations
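A few common string expressions, sketched with illustrative names (Scala strings are Java strings with some extra methods and s-interpolation):

```scala
val hello = "Hello, " + "Scala"                 // concatenation: "Hello, Scala"
val shout = hello.toUpperCase                   // "HELLO, SCALA"
val size = hello.length                         // 12
val topic = "NLP"
val interpolated = s"I like $topic"             // "I like NLP"
val tokens = "the quick brown fox".split(" ")   // Array of 4 words
```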
Logical expressions
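Boolean expressions work as in most languages; the following is only a sketch with illustrative names. Note that == on strings compares contents, not references.

```scala
val bothTrue = true && false         // false
val eitherTrue = true || false       // true
val negated = !true                  // false
val compared = 3 < 5 && "a" == "a"   // true
```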
Scala supports values and variables. Values (val) are constants and cannot be reassigned (immutable), as opposed to variables (var), which can (mutable). Try uncommenting the commented-out line to verify that:
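A minimal sketch of the difference, wrapped in a hypothetical function valVersusVar so it can be pasted anywhere:

```scala
def valVersusVar(): Int = {
  val fixed = 1
  // fixed = 2           // uncommenting this line fails to compile: reassignment to val
  var counter = fixed
  counter = counter + 1  // fine: a var can be reassigned
  counter
}
```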
You might ask yourself: why should I use values and immutable structures? There are arguments for and against them. Immutable structures make code easier to reason about, simplify concurrency, and make the code less prone to bugs (no shared references to take care of). You can find a couple of thoughts about that here and here. You might also ask yourself: how do I change something in an immutable structure then? Easily: you copy it with the change in place :) You will see more of this in the rest of the tutorial.
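Copying with a change in place can be sketched with an immutable List (names are illustrative): each operation returns a new list and leaves the original untouched.

```scala
val original = List(1, 2, 3)
val extended = 0 :: original            // new list: List(0, 1, 2, 3)
val replaced = original.updated(1, 42)  // new list: List(1, 42, 3)
// original is still List(1, 2, 3)
```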
If-then-else
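Since if-then-else is itself an expression, its result can be assigned directly; a tiny sketch with illustrative names:

```scala
val number = 7
val parity = if (number % 2 == 0) "even" else "odd"  // "odd"
```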
For Loop
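A sketch of the two common forms, with hypothetical names: a for loop run for its side effects, and a for comprehension with yield that builds a new collection.

```scala
// Side-effecting loop: accumulate into a local var.
def sumUpTo(n: Int): Int = {
  var total = 0
  for (i <- 1 to n) total += i
  total
}

// for ... yield produces a collection instead of mutating state.
val squares = for (i <- 1 to 5) yield i * i

// A guard (if) filters the values being iterated over.
val evenSquares = for (i <- 1 to 10 if i % 2 == 0) yield i * i
```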
While Loop
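A while loop needs mutable state to make progress. As a sketch, the hypothetical function below counts the decimal digits of a number:

```scala
def countDigits(number: Int): Int = {
  var remaining = math.abs(number)
  var digits = 1
  while (remaining >= 10) {
    remaining /= 10
    digits += 1
  }
  digits
}
```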
In Scala, you create functions with the keyword def, e.g.:
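Two minimal definitions, with illustrative names; the second leaves out the return type and lets the compiler infer it:

```scala
def addOne(x: Int): Int = x + 1
def multiply(x: Int, y: Int) = x * y  // return type Int is inferred
```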
As you can see from the definition, you need to specify the types of the parameters, but you can omit the return type because the interpreter/compiler will infer it (except for recursive functions, which need an explicit return type). Functions can be stored in variables and passed as parameters, because function values are full-fledged Scala objects.
Let's look at some of the things functions can do using a small NLP example: let's build something (maybe) useful which depluralizes nouns, that is, removes their plural suffixes:
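One possible, deliberately naive version is sketched below. The rules are an assumption for illustration, not the tutorial's original code, and they get words such as "bus" wrong.

```scala
def depluralize(noun: String): String = {
  if (noun.endsWith("ies")) noun.dropRight(3) + "y"  // parties -> party
  else if (noun.endsWith("s")) noun.dropRight(1)     // cats -> cat
  else noun                                          // sheep -> sheep
}

val singulars = List("cats", "parties", "sheep").map(depluralize)
```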
They are literally objects!
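A sketch of what that means in practice, with illustrative names: a function value has methods of its own, such as apply and andThen from the standard Function1 trait.

```scala
val increment: Int => Int = x => x + 1

val viaCall = increment(1)                      // 2
val viaApply = increment.apply(1)               // 2, the call above is sugar for this
val incrementTwice = increment.andThen(increment)
val three = incrementTwice(1)                   // 3
```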
If you don't want your function to return a value (like void in C), use Unit as the return type:
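For example, a hypothetical logging helper that is called only for its side effect:

```scala
def logMessage(message: String): Unit = {
  println(s"[log] $message")
}
```

Calling logMessage("starting") prints the line and evaluates to (), the only value of type Unit.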
Since functions are objects, we can pass them to functions!
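A minimal sketch of a higher-order function (all names are illustrative): applyTwice takes a function as its first parameter.

```scala
def applyTwice(f: Int => Int, x: Int): Int = f(f(x))
def addThree(x: Int): Int = x + 3

val seven = applyTwice(addThree, 1)           // addThree(addThree(1)) = 7
val doubled = List(1, 2, 3).map(x => x * 2)   // List(2, 4, 6)
```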
There are different ways of writing functions!
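The same doubling function written four ways, as a sketch with made-up names: a def, an anonymous function literal, a literal with the type on the val, and the underscore placeholder syntax.

```scala
def doubleMethod(x: Int): Int = x * 2
val doubleLiteral = (x: Int) => x * 2
val doubleTyped: Int => Int = x => x * 2
val doublePlaceholder: Int => Int = _ * 2
```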
The last expression in the body of a function is its return value. Also, functions without arguments can be called without parentheses.
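Both points in one sketch (names are illustrative). Here answer is defined without a parameter list, which is the form that can be called without parentheses in every Scala version.

```scala
def describe(n: Int): String = {
  val kind = if (n < 0) "negative" else "non-negative"
  s"$n is $kind"  // last expression, so this is the return value
}

def answer: Int = 42
val doubledAnswer = answer * 2  // called without parentheses: 84
```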
In-code TODO statements
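Assuming this refers to the standard ??? placeholder from Predef, a sketch looks like this: the function type-checks, but calling it throws scala.NotImplementedError. The function name is hypothetical.

```scala
// Compiles now; the body can be filled in later.
def notYetWritten(x: Int): Int = ???
```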
Case classes are regular Scala classes which export their constructor parameters and let you decompose them recursively with pattern matching. You don't have to write new to create an instance!
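A small sketch using a hypothetical Token class for a word and its part-of-speech tag:

```scala
case class Token(word: String, tag: String)

val token = Token("dogs", "NNS")            // no `new` needed
val word = token.word                       // constructor parameters are fields: "dogs"
val retagged = token.copy(tag = "NN")       // copy with one field changed
val same = token == Token("dogs", "NNS")    // structural equality: true
```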
Pattern matching is the second most used feature of Scala. It is a general mechanism which allows you to match on different kinds of data structures.
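To give a flavour of this, the hypothetical function classify below matches on a constant, on types (with a guard), on the shape of a list, and finally on anything else:

```scala
def classify(value: Any): String = value match {
  case 0                => "zero"
  case n: Int if n > 0  => "positive int"
  case s: String        => s"string of length ${s.length}"
  case List(first, _*)  => s"list starting with $first"
  case _                => "something else"
}
```

Cases are tried top to bottom, and the first one that matches wins.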
We can define a factorial function with pattern matching as follows.
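One way to write it, as a sketch that assumes n is non-negative and small enough not to overflow Int. The return type is explicit because the function is recursive.

```scala
def factorial(n: Int): Int = n match {
  case 0 => 1                     // base case
  case _ => n * factorial(n - 1)  // recursive case
}
```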
Now that we know enough Scala, we can look at a Hello World application:
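The classic form is sketched below: a singleton object with a main method, which is the entry point the JVM looks for. The object name is arbitrary.

```scala
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}
```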
The preferred way of starting your programs:
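Assuming this refers to the standard App trait, which was the usual shortcut when this tutorial was written, a sketch looks like this: the body of the object becomes the program, with no explicit main method.

```scala
object HelloApp extends App {
  println("Hello, world!")
}
```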