COMP6714 Word Embeddings Demonstration using TensorFlow¶
In this notebook, we demonstrate a basic implementation of word embeddings by training the skip-gram model on a small test corpus, Text8 (based on TensorFlow's word2vec tutorial). It will give you hands-on experience prior to COMP6714 Project2.
The key part of word embeddings is the training data. In this implementation, mini-batches are generated on demand, and are used by the model to update the word vectors.
You are encouraged to play with this implementation, change the model parameters and the training data, and analyze the effect of these changes.
Note that you do not need to dig too deep into the TensorFlow code and the training part, though if you are into deep learning, it is good to do so.
We evaluate the quality of the embeddings by computing the top-10 nearest words for a sample of words. You will observe that semantically coherent words are embedded close to each other in the embedding space.
Some instructions for this notebook are given below:
We are using TensorFlow 1.2.1 for this implementation, but the code should work with any version above 1.0.
For TensorFlow installation, please follow the link: https://www.tensorflow.org/install/
This notebook automatically downloads the training dataset 'text8.zip' used for the word embeddings.
A detailed description of each part is given in the corresponding sub-sections.
In the next cell, we load all the requisite libraries and modules used in this notebook.
In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import math
import os
import random
from tempfile import gettempdir
import zipfile
import numpy as np
from six.moves import urllib
import tensorflow as tf
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE
Step 1: Downloading the dataset, and reading it as a list of words.¶
In [2]:
url = 'http://mattmahoney.net/dc/'

def download(filename, expected_bytes):
    """Download a file if not present in the working directory, and ensure that the file size is correct."""
    if not os.path.exists(filename):
        filename, _ = urlretrieve(url + filename, filename)
    statinfo = os.stat(filename)
    if statinfo.st_size == expected_bytes:
        print('Found and verified %s' % filename)
    else:
        print(statinfo.st_size)
        raise Exception('Failed to verify ' + filename + '. Can you get to it with a browser?')
    return filename

filename = download('text8.zip', 31344016)
print(filename)

# Function to read the data into a list of words.
def read_data(filename):
    """Extract the first file enclosed in a zip file as a list of words."""
    with zipfile.ZipFile(filename) as f:
        data = tf.compat.as_str(f.read(f.namelist()[0])).split()
    return data

data = read_data(filename)
print('Data size', len(data))
Found and verified text8.zip
text8.zip
Data size 17005207
Step 2: Build the dictionary and replace rare words with UNK token.¶
The next cell reads the data into four global variables:
data – The original text with each word replaced by its integer id (i.e., an id from 0 to vocabulary_size-1), to facilitate processing.
count – A list of [word, count] pairs storing the occurrence counts of the most frequently occurring words in the corpus (plus 'UNK').
dictionary – A map (word → id) used to store words and their integer ids.
reverse_dictionary – A map (id → word) storing the inverse mapping of dictionary, to speed up id-to-word look-ups.
In [3]:
vocabulary_size = 50000  # This variable defines the maximum vocabulary size.

def build_dataset(words, n_words):
    """Process raw inputs into a dataset.
    words: a list of words, i.e., the input data
    n_words: vocabulary size limit; all other words will be mapped to 'UNK'
    """
    count = [['UNK', -1]]
    count.extend(collections.Counter(words).most_common(n_words - 1))
    dictionary = dict()
    for word, _ in count:
        dictionary[word] = len(dictionary)
    data = list()
    unk_count = 0
    for word in words:
        index = dictionary.get(word, 0)
        if index == 0:  # i.e., one of the 'UNK' words
            unk_count += 1
        data.append(index)
    count[0][1] = unk_count
    reversed_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return data, count, dictionary, reversed_dictionary

data, count, dictionary, reverse_dictionary = build_dataset(data, vocabulary_size)
# The raw word list has been replaced by its id list (reusing the name `data`), which helps save space.
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10], [reverse_dictionary[i] for i in data[:10]])
Most common words (+UNK) [['UNK', 418391], ('the', 1061396), ('of', 593677), ('and', 416629), ('one', 411764)]
Sample data [5234, 3081, 12, 6, 195, 2, 3134, 46, 59, 156] ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against']
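As a quick usage sketch (assuming the cell above has been run), the two dictionaries convert between words and ids, and count gives the raw frequencies; the expected values in the comments follow from the output just printed:
print(dictionary['the'])          # 1, since 'the' is the most frequent real word after 'UNK'
print(reverse_dictionary[3081])   # 'originated', matching the sample data above
print(count[1])                   # ('the', 1061396)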
In the next sections, we implement the skip-gram model for word embeddings. Details about the skip-gram model can be found in the following paper: https://arxiv.org/abs/1411.2738.
Step 3: Generating a training batch for the skip-gram model.¶
This function generates a training batch for the skip-gram model. We use mini-batch stochastic gradient descent to optimize the objective function. For this, we generate a training batch based on the following input parameters: batch_size, num_samples, and skip_window.
batch_size controls the size of the mini-batch of training data, i.e., the number of (CENTER_WORD, CONTEXT_WORD) pairs. Note that all CENTER_WORDs are stored and returned in the array batch, and all CONTEXT_WORDs are stored and returned in the array labels.
In the skip-gram model, we generate training data by capturing the context words from the surroundings of a center word. Here the parameter skip_window defines a sliding window of size (2*skip_window+1), i.e., up to skip_window words to the left and right of the center word.
Within a sliding window, we sample num_samples context words (i.e., excluding the center word). num_samples must be no larger than 2*skip_window.
Example:
As an example, with batch_size = 8, skip_window = 1, and num_samples = 2, the function generates the following data batch. We consider these words as (input, output) pairs for the skip-gram model.
Text: anarchism originated as a term of abuse first used against early
(CENTER_WORD, CONTEXT_WORD) Pairs:
(originated,anarchism), (originated,as), (as,originated), (as,a), (a,term)
(a,as), (term,of), (term,a)
Explanations:
We choose to always generate num_samples pairs for each center word. Therefore, we only need to consider 8/2 = 4 center words.
We start from the second word (so that it has a full-sized sliding window).
With skip_window = 1 and num_samples = 2, we generate two training pairs from the two context words (in random order).
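Before looking at the batched implementation, the following minimal sketch (the helper name and toy sentence are illustrative only, not part of the notebook's code) enumerates every (CENTER_WORD, CONTEXT_WORD) pair a full window produces; generate_batch below simply samples num_samples of these pairs per center word:
def all_skipgram_pairs(tokens, skip_window):
    """Enumerate every (center, context) pair within +/- skip_window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - skip_window)
        hi = min(len(tokens), i + skip_window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(all_skipgram_pairs('anarchism originated as a term'.split(), skip_window=1))
# [('anarchism', 'originated'), ('originated', 'anarchism'), ('originated', 'as'), ...]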
The implementation here is a slightly modified version from tensorflow’s word2vec tutorial code; it is still a very ugly implementation.
In [4]:
data_index = 0
# The variable `data_index` is (ab)used in this implementation:
# Outside the sample generation loop, it is the position of the sliding window: from data_index to data_index + span.
# Inside the sample generation loop, it is the index of the next word to be added to the size-limited buffer.

def generate_batch(batch_size, num_samples, skip_window):
    global data_index
    assert batch_size % num_samples == 0
    assert num_samples <= 2 * skip_window

    batch = np.ndarray(shape=(batch_size), dtype=np.int32)
    labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
    span = 2 * skip_window + 1  # span is the width of the sliding window
    buffer = collections.deque(maxlen=span)
    if data_index + span > len(data):
        data_index = 0
    buffer.extend(data[data_index:data_index + span])  # initial buffer content = first sliding window
    print('data_index = {}, buffer = {}'.format(data_index, [reverse_dictionary[w] for w in buffer]))

    data_index += span
    for i in range(batch_size // num_samples):
        context_words = [w for w in range(span) if w != skip_window]
        random.shuffle(context_words)
        words_to_use = collections.deque(context_words)  # now we obtain a random list of context words
        for j in range(num_samples):  # generate the training pairs
            batch[i * num_samples + j] = buffer[skip_window]
            context_word = words_to_use.pop()
            labels[i * num_samples + j, 0] = buffer[context_word]  # buffer[context_word] is a random context word

        # slide the window to the next position
        if data_index == len(data):
            buffer.extend(data[:span])  # wrap around to the start; extend() preserves the deque's maxlen behaviour
            data_index = span
        else:
            buffer.append(data[data_index])  # due to the size limit, the left-most word is automatically removed from the buffer
            data_index += 1

        print('data_index = {}, buffer = {}'.format(data_index, [reverse_dictionary[w] for w in buffer]))
    # end-of-for
    data_index = (data_index + len(data) - span) % len(data)  # move data_index back by `span`
    return batch, labels
print('data[0:10] = {}'.format([reverse_dictionary[i] for i in data[:10]]))

print('\n.. First batch')
batch, labels = generate_batch(batch_size=8, num_samples=2, skip_window=1)
for i in range(8):
    print(reverse_dictionary[batch[i]], '->', reverse_dictionary[labels[i, 0]])
print(data_index)

print('\n.. Second batch')
batch, labels = generate_batch(batch_size=8, num_samples=2, skip_window=1)
for i in range(8):
    print(reverse_dictionary[batch[i]], '->', reverse_dictionary[labels[i, 0]])
print(data_index)
data[0:10] = ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against']
.. First batch
data_index = 0, buffer = ['anarchism', 'originated', 'as']
data_index = 4, buffer = ['originated', 'as', 'a']
data_index = 5, buffer = ['as', 'a', 'term']
data_index = 6, buffer = ['a', 'term', 'of']
data_index = 7, buffer = ['term', 'of', 'abuse']
originated -> as
originated -> anarchism
as -> originated
as -> a
a -> term
a -> as
term -> of
term -> a
4
.. Second batch
data_index = 4, buffer = ['term', 'of', 'abuse']
data_index = 8, buffer = ['of', 'abuse', 'first']
data_index = 9, buffer = ['abuse', 'first', 'used']
data_index = 10, buffer = ['first', 'used', 'against']
data_index = 11, buffer = ['used', 'against', 'early']
of -> term
of -> abuse
abuse -> of
abuse -> first
first -> abuse
first -> used
used -> against
used -> first
8
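You can also try a wider window, as encouraged above; for example (a usage sketch only, and since it continues from the current data_index, the exact pairs printed will differ from a fresh run):
batch, labels = generate_batch(batch_size=8, num_samples=4, skip_window=2)
for i in range(8):
    print(reverse_dictionary[batch[i]], '->', reverse_dictionary[labels[i, 0]])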
Step 4: Build the skip-gram model.¶
In [6]:
# Specification of training data:
batch_size = 128       # Size of the mini-batch for the skip-gram model.
embedding_size = 128   # Dimension of the embedding vector.
skip_window = 1        # How many words to consider left and right of the target word.
num_samples = 2        # How many times to reuse an input to generate a label.
num_sampled = 64       # Sample size for negative examples.
logs_path = './log/'

# Specification of the test sample:
sample_size = 20       # Random sample of words to evaluate similarity.
sample_window = 100    # Only pick samples from the head of the distribution.
sample_examples = np.random.choice(sample_window, sample_size, replace=False)  # Randomly pick a sample of size sample_size

## Constructing the graph...
graph = tf.Graph()
with graph.as_default():
    with tf.device('/cpu:0'):
        # Placeholders to read input data.
        with tf.name_scope('Inputs'):
            train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
            train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

        # Look up embeddings for inputs.
        with tf.name_scope('Embeddings'):
            sample_dataset = tf.constant(sample_examples, dtype=tf.int32)
            embeddings = tf.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
            embed = tf.nn.embedding_lookup(embeddings, train_inputs)

        # Construct the variables for the NCE loss.
        nce_weights = tf.Variable(tf.truncated_normal([vocabulary_size, embedding_size],
                                                      stddev=1.0 / math.sqrt(embedding_size)))
        nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

        # Compute the average NCE loss for the batch.
        # tf.nn.nce_loss automatically draws a new sample of the negative labels each
        # time we evaluate the loss.
        with tf.name_scope('Loss'):
            loss = tf.reduce_mean(tf.nn.nce_loss(weights=nce_weights, biases=nce_biases,
                                                 labels=train_labels, inputs=embed,
                                                 num_sampled=num_sampled, num_classes=vocabulary_size))

        # Construct the gradient descent optimizer with a learning rate of 1.0.
        with tf.name_scope('Gradient_Descent'):
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss)

        # Normalize the embeddings so that cosine similarity reduces to a dot product.
        with tf.name_scope('Normalization'):
            norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
            normalized_embeddings = embeddings / norm

        sample_embeddings = tf.nn.embedding_lookup(normalized_embeddings, sample_dataset)
        similarity = tf.matmul(sample_embeddings, normalized_embeddings, transpose_b=True)

    # Add variable initializer.
    init = tf.global_variables_initializer()

    # Create a summary to monitor the loss tensor.
    tf.summary.scalar("cost", loss)
    # Merge all summary variables.
    merged_summary_op = tf.summary.merge_all()
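Since every row of normalized_embeddings has unit L2 norm, each entry of similarity is the cosine similarity between a sample word and a vocabulary word. The same computation can be sketched in plain NumPy (the random matrix here is purely illustrative, not the trained embeddings):
emb = np.random.uniform(-1.0, 1.0, size=(5, 4))                  # a pretend (vocab x dim) embedding matrix
norm_emb = emb / np.sqrt((emb ** 2).sum(axis=1, keepdims=True))  # L2-normalize each row
cos_sim = np.dot(norm_emb, norm_emb.T)                           # pairwise cosine similarities
nearest = (-cos_sim[0]).argsort()[1:3]                           # the 2 rows closest to row 0 (index 0 is row 0 itself)
print(nearest)
This is exactly the argsort trick used in the evaluation loop of Step 5 below.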
Step 5: Begin training¶
In order to execute the model, we initialize a session object using tf.Session() and call the respective nodes via session.run() or eval(). The general workflow for the training process is as follows:
Define the number of training steps.
Initialize all variables, i.e., embeddings, weights and biases, using session.run(init).
The placeholders (train_inputs and train_labels) are used to feed the input data generated by generate_batch into the skip-gram model.
The optimizer and loss nodes are executed by calling session.run().
Print out the average loss every 5000 iterations.
Evaluate the similarity every 10000 iterations, looking for the top-10 nearest neighbours of the words in the sample set.
In [7]:
num_steps = 130001

with tf.Session(graph=graph) as session:
    # We must initialize all variables before we use them.
    session.run(init)
    summary_writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
    print('Initializing the model')

    average_loss = 0
    for step in range(num_steps):
        batch_inputs, batch_labels = generate_batch(batch_size, num_samples, skip_window)
        feed_dict = {train_inputs: batch_inputs, train_labels: batch_labels}

        # We perform one update step by evaluating the optimizer op using session.run().
        _, loss_val, summary = session.run([optimizer, loss, merged_summary_op], feed_dict=feed_dict)
        summary_writer.add_summary(summary, step)
        average_loss += loss_val

        if step % 5000 == 0:
            if step > 0:
                average_loss /= 5000
            # The average loss is an estimate of the loss over the last 5000 batches.
            print('Average loss at step ', step, ': ', average_loss)
            average_loss = 0

        # Evaluate similarity after every 10000 iterations.
        if step % 10000 == 0:
            sim = similarity.eval()
            for i in range(sample_size):
                sample_word = reverse_dictionary[sample_examples[i]]
                top_k = 10  # Look for the top-10 neighbours of each word in the sample set.
                nearest = (-sim[i, :]).argsort()[1:top_k + 1]  # skip index 0, which is the sample word itself
                log_str = 'Nearest to %s:' % sample_word
                for k in range(top_k):
                    close_word = reverse_dictionary[nearest[k]]
                    log_str = '%s %s,' % (log_str, close_word)
                print(log_str)
            print()

    final_embeddings = normalized_embeddings.eval()
Initializing the model
Nearest to american: instrument, andechs, akc, kahanamoku, dup, sahara, sandwich, irons, lupus, buffy,
Nearest to that: angelica, wreckin, cannibalism, samba, innodb, clinically, interrogators, fueled, igf, goldsmith,
Nearest to may: journal, asf, insubstantial, attached, cve, platz, rendering, managed, ign, fibonacci,
Nearest to by: dhul, syncopation, manifestation, chekhov, backstreet, colonist, appalachian, pseudopods, dioscorus, bother,
Nearest to into: clusters, sarai, platonic, flintlock, nationalisation, ordering, addressable, handgun, ketchup, niro,
Nearest to first: cleavage, swell, polydor, tiamat, consciously, reentry, bovary, piercing, volumes, ruthenia,
Nearest to while: plucked, pomerania, corundum, gels, ambiguous, supersonic, scrooge, spain, weishaupt, existentialist,
Nearest to such: modem, chill, wine, substandard, necessary, performance, investigates, umami, clermont, downwards,
Nearest to than: hardliners, decrypt, metastability, caracas, airman, intel, foxy, clan, sermons, scientific,
Nearest to called: brahmic, hj, bolster, abodes, vietnamese, modalities, idealist, australes, social, burnham,
Nearest to no: xs, eigenstates, clades, marc, chivalry, traffickers, parsecs, flirty, vipers, empty,
Nearest to some: var, aether, citeaux, expire, bes, buell, mecca, robbie, stylistically, stipulates,
Nearest to would: heralds, prosperous, refusing, rewind, orang, franke, trypomastigotes, eager, fleas, expropriated,
Nearest to world: limburg, predatory, apologies, maynard, rectangles, scorn, munk, yf, oiled, andaman,
Nearest to there: ueshiba, characterise, berlin, intrigue, palatalized, nationale, spoil, attackers, punishable, entrepreneurship,
Nearest to when: kant, schroeder, reactivated, slums, rejoice, swearing, arran, harmful, alen, franz,
Nearest to time: eigenvectors, nuba, nubia, escaped, delusions, tapioca, molars, metallica, bangla, elmira,
Nearest to at: rh, hamlet, rsted, nesting, premiums, arsenal, propped, stacey, capitan, gerhard,
Nearest to not: equinoxes, charmed, aug, palm, gnome, unger, metaphorical, mistakenly, stifle, aegis,
Nearest to one: prevented, rematch, proportions, scilly, sotho, jaffa, trespass, cattle, firewalls, chlorine,
Average loss at step 5000 : 73.3854969249
Average loss at step 10000 : 22.9296655355
Nearest to american: instrument, agave, financial, partnership, simplified, sahara, coral, alcohols, sx, computer,
Nearest to that: and, maguey, vs, alpina, neuron, anthropology, blitz, android, protests, in,
Nearest to may: attached, journal, managed, lists, conversation, vs, rendering, seen, tension, note,
Nearest to by: in, and, is, to, sod, was, manufactures, with, for, coke,
Nearest to into: platonic, camus, clusters, ordering, bus, with, of, stations, verde, locke,
Nearest to first: archie, of, refrigerator, volumes, vs, agave, terra, cca, event, cambodia,
Nearest to while: service, central, dddddd, ambiguous, accessible, desert, spassky, sum, spain, ha,
Nearest to such: the, necessary, modem, performance, a, wine, vs, monotheism, jedwabne, clock,
Nearest to than: and, milne, sod, simplest, intel, prior, scientific, clan, older, phi,
Nearest to called: brahmic, vietnamese, gb, coke, defensive, gradually, social, parties, month, scholar,
Nearest to no: gland, hundreds, neolithic, empty, female, parry, marc, archie, win, harlem,
Nearest to some: var, players, basins, controlling, archie, this, souls, agave, snakes, sees,
Nearest to would: prosperous, ibn, text, quarterback, strategy, to, aberdeen, cases, finalist, gland,
Nearest to world: gb, jpg, volume, trials, execution, predatory, pka, brandy, transcends, pulling,
Nearest to there: berlin, ueshiba, gch, trend, andrew, motile, degree, fbi, vs, good,
Nearest to when: july, harmful, region, none, franz, hiroshima, kant, arran, folded, sure,
Nearest to time: escaped, hold, thought, apparent, nubia, chronic, nuremberg, agave, charges, models,
Nearest to at: in, and, is, with, of, one, coke, macleod, vaughan, three,
Nearest to not: coke, plot, excavation, mistakenly, caught, palm, frame, bin, acres, inherent,
Nearest to one: nine, two, archie, zero, three, gb, vs, phi, six, aberdeenshire,
Average loss at step 15000 : 12.4232016604
Average loss at step 20000 : 8.57796830249
Nearest to american: and, instrument, agave, english, simplified, sx, financial, apologia, b, newly,
Nearest to that: which, and, then, maguey, pigweed, alpina, acacia, vs, it, blitz,
Nearest to may: would, attached, tuva, journal, eight, conversation, lists, diffuse, nine, but,
Nearest to by: in, was, is, and, for, with, five, from, as, to,
Nearest to into: with, at, platonic, and, from, sarai, in, camus, bus, verde,
Nearest to first: archie, of, in, terra, volumes, refrigerator, agave, event, cca, cambodia,
Nearest to while: and, pomerania, when, desert, corundum, central, sociological, is, on, accessible,
Nearest to such: alpaca, modem, necessary, performance, monotheism, circulation, seven, clock, virginity, default,
Nearest to than: and, milne, sod, five, simplest, intel, factions, scientific, antoninus, or,
Nearest to called: vietnamese, brahmic, coke, gb, antwerp, and, dasyprocta, av, homomorphism, defensive,
Nearest to no: hundreds, neolithic, gland, empty, female, integrative, tuning, parry, three, marc,
Nearest to some: the, var, this, citeaux, agouti, its, controlling, players, snakes, charts,
Nearest to would: to, can, prosperous, may, fare, isu, quarterback, cases, text, strategy,
Nearest to world: volume, gb, trials, insights, agouti, jpg, peptide, execution, predatory, pulling,
Nearest to there: it, berlin, they, ueshiba, trend, gch, he, and, punishable, luther,
Nearest to when: was, on, while, alen, harmful, is, in, eight, july, for,
Nearest to time: nubia, escaped, chronic, hold, thought, dasyprocta, nsw, nuremberg, apparent, microscopy,
Nearest to at: in, and, on, with, for, apologia, circ, of, agouti, vaughan,
Nearest to not: coke, to, plot, it, also, caught, excavation, three, mistakenly, bin,
Nearest to one: two, agouti, three, eight, nine, five, seven, dasyprocta, circ, six,
Average loss at step 25000 : 6.96411557055
Average loss at step 30000 : 6.23746108785
Nearest to american: and, instrument, simplified, agave, english, b, barfield, amalthea, financial, overlaps,
Nearest to that: which, bpp, then, it, azad, this, and, drones, pigweed, amalthea,
Nearest to may: would, abet, could, can, tuva, diffuse, attached, eight, nine, but,
Nearest to by: in, with, from, was, for, and, is, as, six, five,
Nearest to into: from, at, with, sarai, in, by, platonic, and, minute, sponsors,
Nearest to first: archie, agave, terra, agouti, otimes, refrigerator, in, cca, current, extrasolar,
Nearest to while: and, pomerania, when, sociological, corundum, central, at, desert, service, weishaupt,
Nearest to such: modem, alpaca, anomalous, well, monotheism, downwards, default, necessary, virginity, circulation,
Nearest to than: and, or, milne, factions, sod, intel, simplest, reactive, caused, scientific,
Nearest to called: and, vietnamese, brahmic, coke, gb, antwerp, dasyprocta, homomorphism, smuggling, gradually,
Nearest to no: hundreds, neolithic, abet, tuning, empty, female, integrative, ever, gland, often,
Nearest to some: abet, var, this, many, the, agouti, its, citeaux, flowered, controlling,
Nearest to would: can, may, to, prosperous, could, isu, had, fare, should, will,
Nearest to world: insights, volume, trials, scorn, gb, pulling, brandy, agouti, peptide, execution,
Nearest to there: it, they, he, berlin, trend, not, ueshiba, frescoes, gch, abet,
Nearest to when: was, while, for, five, eight, in, is, seven, six, otimes,
Nearest to time: tapioca, chronic, nubia, escaped, hold, hotels, delusions, thought, nsw, amalthea,
Nearest to at: in, on, with, and, for, amalthea, apologia, circ, three, agouti,
Nearest to not: to, it, also, coke, caught, bpp, excavation, often, mistakenly, plot,
Nearest to one: two, four, eight, agouti, seven, three, circ, six, amalthea, five,
Average loss at step 35000 : 5.81930053782
Average loss at step 40000 : 5.43187191138
Nearest to american: and, instrument, agave, simplified, english, barfield, amalthea, sandwich, akc, british,
Nearest to that: which, bpp, this, then, but, however, it, when, four, neuron,
Nearest to may: would, can, could, abet, will, tuva, should, diffuse, attached, eight,
Nearest to by: was, with, from, in, is, be, zero, as, and, were,
Nearest to into: from, with, at, sarai, handgun, lucent, minute, under, raided, coke,
Nearest to first: archie, agave, otimes, current, agouti, terra, invaluable, extrasolar, refrigerator, cca,
Nearest to while: and, when, pomerania, sociological, is, on, scrooge, was, corundum, supersonic,
Nearest to such: well, alpaca, modem, monotheism, mosiah, anomalous, downwards, approaching, consoles, circulation,
Nearest to than: or, milne, and, factions, detained, sod, intel, reactive, dennis, five,
Nearest to called: UNK, vietnamese, antwerp, brahmic, coke, gb, and, bolster, smuggling, dasyprocta,
Nearest to no: hundreds, neolithic, abet, tuning, empty, integrative, female, ever, often, a,
Nearest to some: abet, many, this, its, the, agouti, var, flowered, citeaux, these,
Nearest to would: can, may, to, could, will, should, prosperous, isu, eight, renovate,
Nearest to world: insights, volume, scorn, trials, brandy, pulling, gb, predatory, bytes, teamed,
Nearest to there: it, they, he, berlin, not, trend, which, and, gch, frescoes,
Nearest to when: while, was, that, on, is, seven, six, four, eight, five,
Nearest to time: albury, tapioca, chronic, amalthea, thought, hold, hotels, nubia, delusions, chlorophyll,
Nearest to at: in, on, with, amalthea, apologia, and, circ, from, zero, for,
Nearest to not: it, also, to, coke, often, they, bpp, caught, excavation, there,
Nearest to one: two, eight, four, three, six, zero, seven, five, agouti, circ,
Average loss at step 45000 : 5.30632296114
Average loss at step 50000 : 5.12967471261
Nearest to american: and, agave, instrument, english, kennel, simplified, amalthea, british, barfield, french,
Nearest to that: which, this, bpp, then, kapoor, however, but, it, alphorn, when,
Nearest to may: would, can, could, will, might, abet, should, tuva, cannot, eight,
Nearest to by: was, with, in, from, is, be, for, eventually, as, and,
Nearest to into: from, with, at, nationalisation, in, sarai, handgun, under, lucent, sponsors,
Nearest to first: archie, invaluable, next, last, agave, otimes, current, agouti, in, extrasolar,
Nearest to while: and, when, pomerania, sociological, was, but, on, is, spassky, eight,
Nearest to such: well, monotheism, downwards, alpaca, mosiah, modem, anomalous, consoles, known, responsa,
Nearest to than: and, or, milne, factions, detained, sod, reactive, phi, intel, galician,
Nearest to called: UNK, antwerp, gb, brahmic, vietnamese, and, coke, bolster, smuggling, dasyprocta,
Nearest to no: hundreds, ever, neolithic, empty, often, tuning, a, abet, integrative, it,
Nearest to some: many, abet, this, these, its, agouti, flowered, var, the, any,
Nearest to would: can, may, could, will, to, should, isu, prosperous, did, eight,
Nearest to world: insights, scorn, volume, stumbling, pulling, trials, pseudocode, brandy, fida, maynard,
Nearest to there: they, it, he, not, which, berlin, frescoes, abet, gch, trend,
Nearest to when: while, was, eight, for, if, four, six, seven, and, in,
Nearest to time: tapioca, albury, hotels, delusions, amalthea, chlorophyll, hold, battalions, chronic, escaped,
Nearest to at: in, on, with, for, amalthea, three, agouti, apologia, during, and,
Nearest to not: it, also, they, often, you, to, coke, ruth, caught, bpp,
Nearest to one: two, three, four, six, eight, five, seven, agouti, kapoor, circ,
Average loss at step 55000 : 5.10098364229
Average loss at step 60000 : 5.01359458108
Nearest to american: and, british, agave, simplified, english, instrument, kennel, french, barfield, amalthea,
Nearest to that: which, this, bpp, however, then, but, kapoor, when, also, alphorn,
Nearest to may: would, can, could, will, might, should, abet, cannot, tuva, must,
Nearest to by: was, be, with, as, five, eventually, been, from, agouti, against,
Nearest to into: from, nationalisation, at, under, with, sarai, against, lucent, in, handgun,
Nearest to first: archie, next, invaluable, last, agouti, agave, current, second, otimes, event,
Nearest to while: when, and, pulau, pomerania, but, was, sociological, is, acacia, gpo,
Nearest to such: well, downwards, known, mosiah, monotheism, consoles, alpaca, anomalous, modem, responsa,
Nearest to than: or, and, microsite, milne, factions, detained, sod, cooperating, reactive, quiz,
Nearest to called: UNK, michelob, antwerp, and, vietnamese, brahmic, gb, bolster, smuggling, coke,
Nearest to no: a, hundreds, empty, ever, often, it, five, neolithic, appealing, abet,
Nearest to some: many, abet, these, this, its, several, agouti, three, the, two,
Nearest to would: can, may, could, will, to, should, did, isu, had, might,
Nearest to world: insights, umlaut, scorn, stumbling, maynard, pulling, volume, pseudocode, trials, brandy,
Nearest to there: it, they, he, which, not, ursus, abet, often, trend, consoles,
Nearest to when: while, if, six, was, seven, for, eight, four, pulau, is,
Nearest to time: tapioca, sylia, delusions, albury, hotels, battalions, amalthea, michelob, chlorophyll, theron,
Nearest to at: in, on, michelob, during, ursus, amalthea, with, apologia, agouti, six,
Nearest to not: it, they, to, also, often, you, coke, ruth, caught, generally,
Nearest to one: two, six, three, five, ursus, four, agouti, seven, eight, kapoor,
Average loss at step 65000 : 4.87570189486
Average loss at step 70000 : 4.8677266675
Nearest to american: british, simplified, agave, french, instrument, english, kennel, and, barfield, arctos,
Nearest to that: which, this, bpp, kapoor, callithrix, however, then, microcebus, but, upanija,
Nearest to may: would, can, could, will, might, should, must, abet, cannot, tuva,
Nearest to by: was, be, upanija, eventually, thaler, with, as, in, elective, sod,
Nearest to into: from, under, nationalisation, with, at, sarai, against, lucent, mico, on,
Nearest to first: archie, next, second, last, current, agave, invaluable, agouti, otimes, thaler,
Nearest to while: when, and, but, pulau, was, pomerania, sociological, michelob, upanija, although,
Nearest to such: well, downwards, known, mosiah, many, these, consoles, bracing, monotheism, alpaca,
Nearest to than: or, microsite, milne, detained, and, cooperating, factions, sod, thaler, reactive,
Nearest to called: UNK, michelob, antwerp, smuggling, coke, gb, brahmic, ursus, vietnamese, bolster,
Nearest to no: empty, often, hundreds, ever, upanija, logically, another, neolithic, appealing, it,
Nearest to some: many, abet, these, callithrix, several, the, agouti, bracing, this, any,
Nearest to would: can, may, could, will, to, should, might, did, isu, must,
Nearest to world: insights, stumbling, umlaut, scorn, volume, maynard, pulling, pseudocode, trials, customizable,
Nearest to there: they, it, he, still, not, often, callithrix, mico, which, usually,
Nearest to when: while, if, was, before, for, six, upanija, as, four, seven,
Nearest to time: tapioca, delusions, battalions, microcebus, amalthea, hotels, sylia, albury, theron, chlorophyll,
Nearest to at: in, on, michelob, during, callithrix, apologia, three, amalthea, thaler, ursus,
Nearest to not: it, they, often, to, you, generally, also, coke, ruth, bpp,
Nearest to one: six, two, five, four, seven, three, ursus, eight, agouti, microcebus,
Average loss at step 75000 : 4.78121066928
Average loss at step 80000 : 4.78013677206
Nearest to american: british, french, english, simplified, agave, instrument, kennel, and, barfield, thibetanus,
Nearest to that: which, this, however, then, bpp, callithrix, kapoor, but, microcebus, alphorn,
Nearest to may: can, would, could, will, might, should, must, cannot, abet, nine,
Nearest to by: was, be, upanija, eventually, from, thaler, in, with, were, elective,
Nearest to into: from, under, nationalisation, with, against, in, sarai, lucent, mico, sponsors,
Nearest to first: archie, second, last, next, agave, invaluable, agouti, current, otimes, upanija,
Nearest to while: when, and, but, pulau, crb, pomerania, however, although, sociological, michelob,
Nearest to such: well, these, many, known, mosiah, downwards, wraith, consoles, some, alpaca,
Nearest to than: or, and, microsite, milne, detained, cooperating, sod, factions, thaler, but,
Nearest to called: UNK, michelob, antwerp, and, ursus, smuggling, gb, coke, divide, bolster,
Nearest to no: ever, often, hundreds, empty, upanija, it, neolithic, another, agp, appealing,
Nearest to some: many, these, abet, several, callithrix, manure, this, other, bracing, agouti,
Nearest to would: can, may, will, could, to, should, might, must, did, isu,
Nearest to world: insights, scorn, stumbling, umlaut, volume, maynard, pulling, trials, pseudocode, fida,
Nearest to there: they, it, he, which, often, still, not, usually, callithrix, now,
Nearest to when: while, if, before, was, six, however, upanija, as, where, seven,
Nearest to time: tapioca, delusions, battalions, microcebus, amalthea, chlorophyll, theron, ursus, sylia, hotels,
Nearest to at: in, on, during, michelob, amalthea, thaler, polyn, ursus, callithrix, from,
Nearest to not: it, often, generally, they, to, you, also, there, ruth, bpp,
Nearest to one: seven, two, six, five, three, four, ursus, agouti, eight, microcebus,
Average loss at step 85000 : 4.78658102405
Average loss at step 90000 : 4.72901453767
Nearest to american: british, french, english, simplified, instrument, agave, kennel, barfield, thibetanus, amalthea,
Nearest to that: which, however, but, this, bpp, callithrix, then, kapoor, it, microcebus,
Nearest to may: can, would, could, will, might, should, must, cannot, abet, to,
Nearest to by: was, upanija, as, with, be, thaler, eventually, from, foodborne, when,
Nearest to into: from, under, nationalisation, with, through, against, sarai, troll, lucent, mico,
Nearest to first: second, last, archie, next, invaluable, agave, agouti, otimes, only, current,
Nearest to while: when, and, but, pulau, however, although, crb, michelob, pomerania, mitral,
Nearest to such: well, these, many, known, mosiah, downwards, responsa, selamat, some, wraith,
Nearest to than: or, microsite, milne, and, but, detained, cooperating, thaler, factions, sod,
Nearest to called: vert, UNK, michelob, antwerp, ursus, and, gb, pulau, supercar, smuggling,
Nearest to no: often, ever, upanija, a, hundreds, another, empty, any, semiconductors, appealing,
Nearest to some: many, these, abet, several, the, callithrix, manure, other, bracing, this,
Nearest to would: can, may, will, could, should, might, to, must, did, does,
Nearest to world: insights, scorn, stumbling, umlaut, maynard, volume, pulling, trials, customizable, pseudocode,
Nearest to there: they, it, he, often, which, not, mico, still, usually, callithrix,
Nearest to when: if, while, before, for, as, was, where, after, however, during,
Nearest to time: peacocks, battalions, tapioca, delusions, chlorophyll, theron, microcebus, amalthea, hotels, sylia,
Nearest to at: in, on, during, michelob, thaler, ursus, callithrix, amalthea, polyn, from,
Nearest to not: they, often, generally, it, you, there, bpp, also, to, finalist,
Nearest to one: two, four, three, seven, five, eight, six, ursus, crb, agouti,
Average loss at step 95000 : 4.68849103527
Average loss at step 100000 : 4.65652717667
Nearest to american: british, french, english, simplified, instrument, and, agave, kennel, barfield, thibetanus,
Nearest to that: which, however, but, this, bpp, callithrix, then, kapoor, what, microcebus,
Nearest to may: can, would, could, will, might, should, must, cannot, abet, did,
Nearest to by: was, be, upanija, thaler, with, as, eventually, foodborne, gyeongsang, elective,
Nearest to into: from, under, nationalisation, through, with, sponsors, lucent, mico, troll, against,
Nearest to first: second, last, next, archie, agouti, invaluable, current, agave, otimes, upanija,
Nearest to while: when, but, and, although, however, pulau, crb, after, michelob, mitral,
Nearest to such: well, these, many, known, some, mosiah, selamat, downwards, having, responsa,
Nearest to than: or, and, microsite, milne, but, detained, thaler, cooperating, factions, sod,
Nearest to called: vert, and, michelob, antwerp, supercar, ursus, pulau, UNK, smuggling, considered,
Nearest to no: ever, any, upanija, often, another, nine, empty, hundreds, semiconductors, agp,
Nearest to some: many, these, abet, several, callithrix, the, this, manure, any, other,
Nearest to would: can, may, will, could, should, might, must, to, did, does,
Nearest to world: insights, scorn, stumbling, umlaut, pulling, volume, maynard, trials, customizable, peacocks,
Nearest to there: they, it, he, often, now, which, mico, still, usually, callithrix,
Nearest to when: if, while, before, after, where, during, since, as, for, however,
Nearest to time: peacocks, battalions, amalthea, delusions, microcebus, tapioca, chlorophyll, year, hotels, theron,
Nearest to at: in, during, michelob, polyn, thaler, ursus, with, on, callithrix, amalthea,
Nearest to not: generally, they, often, it, to, bpp, also, you, there, ruth,
Nearest to one: six, two, seven, four, five, three, eight, iit, agouti, ursus,
Average loss at step 105000 : 4.62918047304
Average loss at step 110000 : 4.61428171692
Nearest to american: british, english, french, agave, instrument, kennel, simplified, and, thibetanus, barfield,
Nearest to that: which, however, but, bpp, this, callithrix, kapoor, then, when, what,
Nearest to may: can, would, could, will, might, should, must, cannot, abet, did,
Nearest to by: upanija, with, be, was, during, against, thaler, seven, as, without,
Nearest to into: from, under, through, with, nationalisation, troll, sponsors, against, to, lucent,
Nearest to first: second, last, next, archie, agave, current, agouti, invaluable, upanija, otimes,
Nearest to while: when, but, although, however, and, pulau, crb, mitral, after, michelob,
Nearest to such: well, these, many, known, having, some, including, mosiah, downwards, responsa,
Nearest to than: or, and, microsite, milne, but, cooperating, detained, thaler, much, factions,
Nearest to called: vert, michelob, antwerp, ursus, supercar, pulau, bokassa, callithrix, UNK, earns,
Nearest to no: any, ever, another, upanija, empty, semiconductors, agp, hundreds, appealing, often,
Nearest to some: many, these, several, abet, callithrix, the, all, any, manure, other,
Nearest to would: will, can, may, could, might, should, must, did, to, does,
Nearest to world: insights, scorn, stumbling, pulling, umlaut, customizable, maynard, trials, peacocks, volume,
Nearest to there: they, it, he, still, often, which, now, usually, mico, not,
Nearest to when: if, while, before, after, where, however, during, but, although, as,
Nearest to time: peacocks, battalions, delusions, amalthea, year, microcebus, tapioca, kapoor, ursus, hotels,
Nearest to at: in, during, with, from, thaler, on, michelob, polyn, amalthea, ursus,
Nearest to not: generally, often, they, it, bpp, to, you, also, there, ruth,
Nearest to one: seven, two, six, five, four, three, crb, ursus, eight, agouti,
Average loss at step 115000 : 4.5636058845
Average loss at step 120000 : 4.52075491862
Nearest to american: british, english, french, agave, kennel, simplified, adjudged, instrument, barfield, european,
Nearest to that: which, however, bpp, this, kapoor, what, callithrix, but, netbios, alphorn,
Nearest to may: can, would, could, will, might, should, must, cannot, abet, tuva,
Nearest to by: upanija, was, with, be, thaler, through, against, eventually, without, from,
Nearest to into: from, through, under, with, nationalisation, troll, against, sponsors, mico, arctos,
Nearest to first: second, last, next, archie, current, agave, agouti, invaluable, upanija, brassicaceae,
Nearest to while: when, and, although, but, however, pulau, crb, mitral, after, michelob,
Nearest to such: well, these, many, known, some, including, having, responsa, avalanches, mosiah,
Nearest to than: or, but, and, milne, microsite, detained, cooperating, much, thaler, decrypt,
Nearest to called: and, UNK, vert, michelob, ursus, considered, supercar, antwerp, pulau, callithrix,
Nearest to no: any, ever, another, upanija, palomino, empty, semiconductors, appealing, agp, hundreds,
Nearest to some: many, these, several, abet, all, the, any, other, callithrix, manure,
Nearest to would: will, can, could, may, might, should, must, to, did, does,
Nearest to world: insights, scorn, stumbling, umlaut, pulling, customizable, transcends, peacocks, philippines, scripting,
Nearest to there: they, it, he, often, still, now, usually, which, mico, these,
Nearest to when: if, while, before, where, after, however, although, during, but, and,
Nearest to time: peacocks, amalthea, microcebus, battalions, ursus, kapoor, year, aorta, delusions, tapioca,
Nearest to at: in, during, michelob, on, thaler, callithrix, amalthea, abet, sheds, polyn,
Nearest to not: generally, often, they, it, to, you, bpp, still, also, usually,
Nearest to one: three, ursus, six, two, agouti, five, four, seven, iit, callithrix,
Average loss at step 125000 : 4.55231066194
Average loss at step 130000 : 4.52120715604
Nearest to american: british, french, english, agave, kennel, german, simplified, european, adjudged, thibetanus,
Nearest to that: which, however, what, but, bpp, this, callithrix, kapoor, then, alphorn,
Nearest to may: can, would, could, will, might, should, must, cannot, does, did,
Nearest to by: was, be, upanija, with, thaler, through, during, from, without, eventually,
Nearest to into: from, through, under, nationalisation, troll, with, sponsors, against, arctos, mico,
Nearest to first: second, last, next, archie, current, agave, agouti, upanija, invaluable, globemaster,
Nearest to while: when, although, but, and, however, pulau, after, michelob, crb, mitral,
Nearest to such: well, these, many, known, some, including, having, mosiah, responsa, other,
Nearest to than: or, and, but, microsite, milne, much, cooperating, detained, thaler, decrypt,
Nearest to called: UNK, vert, ursus, michelob, supercar, considered, and, antwerp, pulau, callithrix,
Nearest to no: any, upanija, another, semiconductors, ever, palomino, appealing, agp, empty, hundreds,
Nearest to some: many, these, several, abet, all, any, other, the, callithrix, most,
Nearest to would: will, could, can, may, might, should, must, did, does, to,
Nearest to world: insights, scorn, stumbling, pulling, umlaut, customizable, maynard, transcends, philippines, peacocks,
Nearest to there: they, it, he, still, usually, often, now, mico, which, callithrix,
Nearest to when: if, while, before, after, where, during, although, however, but, within,
Nearest to time: peacocks, year, battalions, microcebus, kapoor, amalthea, ursus, aorta, period, rotate,
Nearest to at: in, during, michelob, on, polyn, amalthea, abet, thaler, emulsifiers, under,
Nearest to not: generally, often, it, to, bpp, they, usually, still, you, always,
Nearest to one: seven, six, two, four, eight, agouti, ursus, five, three, microcebus,
Visualize the Graph and the Loss Function in TensorBoard¶
Open a terminal in the current working directory, and follow these steps:
Run the following command: tensorboard --logdir=./log/
Open http://localhost:6006/ in your web browser.
Visualize the model graph under the GRAPHS tab.
Visualize the plot of the loss under the SCALARS tab.
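As an optional final step (not part of the recorded output above), the TSNE import from the first cell can be used to project final_embeddings into 2-D for a quick qualitative look. A minimal sketch, assuming matplotlib is installed:
import matplotlib.pyplot as plt

plot_only = 200  # visualize only the 200 most frequent words
tsne = TSNE(n_components=2, init='pca', perplexity=30, random_state=0)
low_dim_embs = tsne.fit_transform(final_embeddings[:plot_only, :])

plt.figure(figsize=(12, 12))
for i in range(plot_only):
    x, y = low_dim_embs[i, :]
    plt.scatter(x, y)
    plt.annotate(reverse_dictionary[i], xy=(x, y), xytext=(3, 2),
                 textcoords='offset points')
plt.show()
Words that appeared together in the "Nearest to ..." lists above should land near each other in the resulting scatter plot.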
In [ ]: