In general there seems to be no emerging unified theory of NLP to emerge, and most textbooks and courses explain NLP as
a collection of problems, techniques, ideas, frameworks, etc. that really are not tied together in any reasonable way other than the fact that they have to do with NLP.
-- Hal Daume
That is not to say, though, that there aren't cross-cutting patterns, general best practices and recipes that recur frequently. One such recurring pattern, found in many NLP papers and systems, is what I like to refer to as the structured prediction recipe.
The general goal we address with this recipe is the following. Given some input structure $\mathbf{x} \in \mathcal{X}$, we would like to predict a suitable output structure $\mathbf{y} \in \mathcal{Y}$. In the context of NLP, $\mathcal{X}$ may be the set of documents, and $\mathcal{Y}$ a set of document classes (e.g. sports and business). $\mathcal{X}$ may also be the set of French sentences, and $\mathcal{Y}$ the set of English sentences. In this case each $\mathbf{y} \in \mathcal{Y}$ is a structured object (hence the use of bold face). This structured output aspect of the problem has profound consequences for the methods used to address it (as opposed to structure in the input, which one can deal with in a relatively straightforward way). Generally we are also given some training set $\mathcal{D}$, which may contain input-output pairs $(\mathbf{x}, \mathbf{y})$, but possibly also just input data (in unsupervised learning), data annotated for a different task (multi-task, distant or weak supervision, etc.), or some mixture of these.
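With the above ingredients, the recipe goes as follows:

1. Define a parametrized model that scores how well a candidate output structure matches a given input.
2. Learn the parameters of this model from the training set.
3. Given a new input, predict the output structure with the highest score under the learned model.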
You will see examples of this recipe throughout the book, as well as frameworks and methods that make this recipe possible. It is worth noting that good NLPers usually combine three skills in accordance with this recipe: 1. modelling, 2. continuous optimization and 3. discrete optimization. For the second and third, some basic mathematical background is generally useful; for the first, some understanding of the language phenomena you seek to model can be helpful. It is probably fair to say that modelling is the most important of the three: in practice, clever features (which are part of the model) quite often beat clever optimization.
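As a rough illustration of how the three skills fit together, here is a minimal sketch on a toy part-of-speech tagging task. Everything in it is an assumption made for illustration, not a method prescribed by this book: the tag set `TAGS`, the feature templates in `features`, the use of structured perceptron updates as a stand-in for continuous optimization, and the brute-force search over all tag sequences as a stand-in for discrete optimization. Practical systems replace the brute-force search with something more efficient, since the number of candidate outputs grows exponentially with sentence length.

```python
from itertools import product

TAGS = ("DET", "NOUN", "VERB")  # hypothetical toy tag set


def features(x, y):
    """Modelling: count (word, tag) and (previous tag, tag) features."""
    counts = {}
    prev = "<s>"  # start-of-sentence marker
    for word, tag in zip(x, y):
        for f in (("emit", word, tag), ("trans", prev, tag)):
            counts[f] = counts.get(f, 0) + 1
        prev = tag
    return counts


def score(weights, x, y):
    """Linear score of output y for input x: weights dotted with features."""
    return sum(weights.get(f, 0.0) * v for f, v in features(x, y).items())


def predict(weights, x):
    """Discrete optimization: brute-force argmax over all tag sequences."""
    candidates = product(TAGS, repeat=len(x))
    return max(candidates, key=lambda y: score(weights, x, y))


def train(data, epochs=5):
    """Parameter learning with structured perceptron updates."""
    weights = {}
    for _ in range(epochs):
        for x, y in data:
            guess = predict(weights, x)
            if guess != y:
                # Move weights towards the gold structure, away from the guess.
                for f, v in features(x, y).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in features(x, guess).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights


# Toy training set of input-output pairs (x, y).
train_data = [
    (("the", "dog", "barks"), ("DET", "NOUN", "VERB")),
    (("dogs", "bark"), ("NOUN", "VERB")),
]

weights = train(train_data)
print(predict(weights, ("the", "dogs", "bark")))  # ('DET', 'NOUN', 'VERB')
```

Note that the test sentence does not appear in the training set: the model tags it correctly only because the emission and transition features generalize across sentences, which is one small instance of the modelling choices mattering more than the optimizer.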
The structured prediction recipe can be found in several places within this book: