These are lecture notes for Art and Science of Machine Learning, the fifth course in the Coursera specialization “Machine Learning with TensorFlow on Google Cloud Platform”.



Review of Embeddings


  • Creating an embedding column from a feature cross.
  • The weights in the embedding column are learned from data.
  • The model learns how to embed the feature cross in a lower-dimensional space.
  • Embedding a feature cross in TensorFlow:
 
from tensorflow import feature_column as fc

# Cross day-of-week and hour-of-day into one categorical column
# with 24*7 hash buckets.
day_hr = fc.crossed_column([dayofweek, hourofday], 24*7)

# Transfer learning: load the embedding weights from a checkpoint of a
# similar model, and freeze them (trainable=False).
day_hr_em = fc.embedding_column(day_hr, 2,
		ckpt_to_load_from='london/*ckpt-1000*',
		tensor_name_in_ckpt='dayhr_embed',
		trainable=False)
  • Transfer learning of embeddings from similar ML models (the slide diagram has four layers):
    • First layer: the feature cross
    • Second layer: a “mystery box” of latent factors
    • Third layer: the embedding (two dimensions here)
    • Fourth layer: the outputs the latent factors explain, illustrated by an image of traffic on one side and an image of people watching TV on the other
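  • For reference, a minimal sketch (not from the course) of how the pre-trained day_hr_em column could be used: it plugs into a canned estimator like any other feature column. DNNRegressor is real TensorFlow API; the model directory and input function are hypothetical placeholders.

import tensorflow as tf

# The frozen, pre-trained embedding column behaves like any other
# feature column when handed to a canned estimator.
model = tf.estimator.DNNRegressor(
	feature_columns=[day_hr_em],
	hidden_units=[64, 16],
	model_dir='taxi_model')              # hypothetical output directory
model.train(input_fn=taxifare_input_fn)  # assumed input function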


Recommendations


  • Using a second dimension gives us more freedom in organizing movies by similarity
  • A d-dimensional embedding assumes that user interest in movies can be approximated by d aspects (d < N, the number of movies)
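  • As a toy illustration (made-up coordinates, not from the course), a 2-dimensional embedding might place movies like this, with dot-product similarity recovering which movies are alike:

# Made-up 2-d embedding: each movie is a point; nearby points are similar.
movie_embeddings = {
	'Shrek':           ( 1.0,  0.9),   # kids / animated
	'The Incredibles': ( 0.9,  1.0),
	'Bleu':            (-1.0,  0.8),   # arthouse
	'Memento':         (-0.6, -0.4),
}

def dot(u, v):
	"""Dot-product similarity between two embedding vectors."""
	return sum(a * b for a, b in zip(u, v))

print(dot(movie_embeddings['Shrek'], movie_embeddings['The Incredibles']))  # 1.8
print(dot(movie_embeddings['Shrek'], movie_embeddings['Bleu']))             # -0.28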



Data-driven Embeddings


  • We could give the axes names, but it is not essential
  • It’s easier to train a model with d inputs than a model with N inputs
  • Embeddings can be learned from data



Sparse Tensors


  • Dense representations are inefficient in space and compute

  • So, use a sparse representation to hold the example

    • Build a dictionary mapping each feature to an integer from 0, …, #movies − 1
    • Efficiently represent the sparse vector as just the movies the user watched (see the sketch below)
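  • A minimal sketch of that sparse representation (the movie vocabulary is made up):

# Dictionary mapping each movie to an integer in [0, #movies).
movie_to_id = {'Shrek': 0, 'Bleu': 1, 'Memento': 2, 'The Incredibles': 3}

# Dense representation: one slot per movie in the vocabulary.
dense = [1, 0, 0, 1]   # the user watched Shrek and The Incredibles

# Sparse representation: just the indices of the movies watched.
sparse = [movie_to_id['Shrek'], movie_to_id['The Incredibles']]   # [0, 3]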
  • Representing feature columns as sparse vectors (These are all different ways to create a categorical column)

    • If you know the keys beforehand:
tf.feature_column.categorical_column_with_vocabulary_list('employeeId',
	vocabulary_list=['8345', '72345', '87654', '98723', '23451'])
  
  • If your data is already indexed, i.e., has integers in [0, N):
tf.feature_column.categorical_column_with_identity('employeeId',
	num_buckets=5)
  
  • If you don’t have a vocabulary of all possible values:
 
tf.feature_column.categorical_column_with_hash_bucket('employeeId',
	hash_bucket_size = 500)
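  • Note that a DNN model cannot consume a categorical column directly; wrap it in an indicator_column (one-/multi-hot) or an embedding_column first. A minimal sketch using the hash-bucket column above:

emp = tf.feature_column.categorical_column_with_hash_bucket('employeeId',
	hash_bucket_size=500)

# One-hot wrapper for deep models...
emp_indicator = tf.feature_column.indicator_column(emp)
# ...or a dense, learned embedding.
emp_embedded = tf.feature_column.embedding_column(emp, dimension=5)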


Train an Embedding


  • Embeddings are feature columns that function like layers
 
sparse_word = fc.categorical_column_with_vocabulary_list('word',
	vocabulary_list=englishWords)
embedded_word = fc.embedding_column(sparse_word, 3)
  • The weights in the embedding layer are learned through backprop just as with other weights
  • Embeddings can be thought of as latent features.
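  • A minimal sketch (hidden_units and the input function are hypothetical) of the embedded column above acting as a layer inside a canned estimator:

import tensorflow as tf

# The embedding weights are trained by backprop together with the
# hidden layers of the network.
classifier = tf.estimator.DNNClassifier(
	feature_columns=[embedded_word],
	hidden_units=[32, 8])
classifier.train(input_fn=train_input_fn)   # assumed input function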



Similarity Property


  • Embeddings provide dimensionality reduction.


  • You can take advantage of this similarity property of embeddings

  • A good starting point for the number of embedding dimensions:

    • Higher dimensions → more accuracy
    • Higher dimensions → overfitting, slow training
    • Empirical tradeoff:
    \[ dimensions \approx \sqrt[4]{possible\ values} \]
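    For example, the day-hour feature cross above has 24 × 7 = 168 possible values, so the rule of thumb suggests starting near
    \[ \sqrt[4]{168} \approx 3.6 \]
    i.e., 3–4 dimensions, then tuning empirically.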


Custom Estimator


  • Estimator provides a lot of benefits
  • Canned Estimators are sometimes insufficient


  • Suppose that you want to use a model structure from a research paper…

    • Implement the model using low-level TensorFlow ops
def model_from_research_paper(timeseries):
	# Split the (batch, N_INPUTS) tensor into a list of N_INPUTS tensors
	# along axis 1, as static_rnn expects.
	x = tf.split(timeseries, N_INPUTS, 1)
	lstm_cell = rnn.BasicLSTMCell(LSTM_SIZE, forget_bias=1.0)
	outputs, _ = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
	outputs = outputs[-1]   # keep only the last output in the sequence
	weights = tf.Variable(tf.random_normal([LSTM_SIZE, N_OUTPUTS]))
	bias = tf.Variable(tf.random_normal([N_OUTPUTS]))
	predictions = tf.matmul(outputs, weights) + bias
	return predictions
  
  • How do we wrap this custom model into Estimator framework?

  • Create train_and_evaluate function with the base-class Estimator

 
def train_and_evaluate(output_dir, ...):
	estimator = tf.estimator.Estimator(model_fn=myfunc,
		model_dir=output_dir)
	train_spec = get_train()
	exporter = ...
	eval_spec = get_valid()
	tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
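  • get_train and get_valid are elided on the slide; a minimal sketch of what they might return, assuming train_input_fn/eval_input_fn input functions:

def get_train():
	# TrainSpec bundles the training input function with a step budget.
	return tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=1000)

def get_valid():
	# EvalSpec bundles the eval input function; steps=None reads the
	# entire evaluation dataset.
	return tf.estimator.EvalSpec(input_fn=eval_input_fn, steps=None)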
  • myfunc (above) returns an EstimatorSpec.

    • The six things in an EstimatorSpec:
    1. Mode is pass-through
    2. Any tensors you want to return
    3. Loss metric
    4. Training op
    5. Eval ops
    6. Export outputs
def myfunc(features, targets, mode):
	# Code up the model
	predictions = model_from_research_paper(features[INCOL])
	predictions_dict = {"predicted": predictions}

	# Set up loss function, training/eval ops
	... # (next code)

	# Create export outputs
	export_outputs = {"regression_export_outputs":
		tf.estimator.export.RegressionOutput(value=predictions)}

	# Return EstimatorSpec
	return tf.estimator.EstimatorSpec(
		mode=mode,
		predictions=predictions_dict,
		loss=loss,
		train_op=train_op,
		eval_metric_ops=eval_metric_ops,
		export_outputs=export_outputs)

  
  • The ops are set up in the appropriate mode
 
if (mode == tf.estimator.ModeKeys.TRAIN or
		mode == tf.estimator.ModeKeys.EVAL):
	loss = tf.losses.mean_squared_error(targets, predictions)
	train_op = tf.contrib.layers.optimize_loss(
		loss=loss,
		global_step=tf.contrib.framework.get_global_step(),
		learning_rate=0.01,
		optimizer="SGD")
	eval_metric_ops = {
		"rmse": tf.metrics.root_mean_squared_error(targets, predictions)}
else:
	# PREDICT mode: no loss, training, or evaluation ops needed
	loss = None
	train_op = None
	eval_metric_ops = None


Keras Models


  • Keras is a high-level deep neural network library that supports multiple backends
  • Keras is easy to use for fast prototyping
 
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM

model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
		optimizer='rmsprop',
		metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)
  • From a compiled Keras model, you can get an Estimator
 
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Embedding, LSTM

model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
		optimizer='rmsprop',
		metrics=['accuracy'])

# Get an Estimator from the compiled Keras model
estimator = keras.estimator.model_to_estimator(keras_model=model)
  • You will use this estimator the way you normally use an estimator
 
def train_and_evaluate(output_dir):
	estimator = make_keras_estimator(output_dir)
	train_spec = tf.estimator.TrainSpec(train_fn, max_steps=1000)
	exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)
	eval_spec = tf.estimator.EvalSpec(eval_fn,
				steps=None,
				exporters=exporter)
	tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  • The connection between the input features and Keras is through a naming convention: the feature key is the Keras layer name with _input appended
 
model = keras.models.Sequential()
model.add(keras.layers.Dense(..., name='XYZ'))

def train_input_fn():
	...
	features = {
		'XYZ_input': some_tensor,
	}
	return features, labels
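  • If you are unsure of the expected key, the compiled model can tell you; model.input_names lists the input layer names Keras generated:

print(model.input_names)   # e.g. ['XYZ_input']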