Thursday, December 31, 2015
Trying a doc2vec example by Pandamonium
from https://github.com/linanqiu/word2vec-sentiments
Below is what I tried and what I got. Before you start, download the test files from the repository.
It worked almost out of the box, except for a couple of very minor changes I had to make (noted below). The performance is just great. It seems to be the best doc2vec tutorial I've found. Highly recommended.
# gensim modules
from gensim import utils
from gensim.models.doc2vec import LabeledSentence
from gensim.models import Doc2Vec
# numpy
import numpy
# shuffle
from random import shuffle
# logging
import logging
import os.path
import sys
try:
    import cPickle as pickle
except ImportError:
    import pickle
program = os.path.basename(sys.argv[0])
logger = logging.getLogger(program)
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s')
logging.root.setLevel(level=logging.INFO)
logger.info("running %s" % ' '.join(sys.argv))
class LabeledLineSentence(object):
    def __init__(self, sources):
        self.sources = sources

        flipped = {}

        # make sure that keys are unique
        for key, value in sources.items():
            if value not in flipped:
                flipped[value] = [key]
            else:
                raise Exception('Non-unique prefix encountered')

    def __iter__(self):
        for source, prefix in self.sources.items():
            with utils.smart_open(source) as fin:
                for item_no, line in enumerate(fin):
                    yield LabeledSentence(utils.to_unicode(line).split(), [prefix + '_%s' % item_no])

    def to_array(self):
        self.sentences = []
        for source, prefix in self.sources.items():
            with utils.smart_open(source) as fin:
                for item_no, line in enumerate(fin):
                    self.sentences.append(LabeledSentence(
                        utils.to_unicode(line).split(), [prefix + '_%s' % item_no]))
        return self.sentences

    def sentences_perm(self):
        shuffle(self.sentences)
        return self.sentences
sources = {'test-neg.txt':'TEST_NEG', 'test-pos.txt':'TEST_POS', 'train-neg.txt':'TRAIN_NEG', 'train-pos.txt':'TRAIN_POS', 'train-unsup.txt':'TRAIN_UNS'}
sentences = LabeledLineSentence(sources)
model = Doc2Vec(min_count=1, window=10, size=100, sample=1e-4, negative=5, workers=16)
model.build_vocab(sentences.to_array())
for epoch in range(50):
    logger.info('Epoch %d' % epoch)
    model.train(sentences.sentences_perm())

model.save('./imdb.d2v')
model = Doc2Vec.load('./imdb.d2v')
model.most_similar('good')
model.docvecs['TRAIN_NEG_0']
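Beyond pulling out a single document vector, you can also ask which labeled documents the model considers closest to a given one; a minimal sketch, assuming the docvecs.most_similar call is available in the gensim version used here:
# training documents whose vectors are closest to the first negative review
print(model.docvecs.most_similar('TRAIN_NEG_0', topn=5))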
train_arrays = numpy.zeros((25000, 100))
train_labels = numpy.zeros(25000)
for i in range(12500):
    prefix_train_pos = 'TRAIN_POS_' + str(i)
    prefix_train_neg = 'TRAIN_NEG_' + str(i)
    train_arrays[i] = model.docvecs[prefix_train_pos]
    train_arrays[12500 + i] = model.docvecs[prefix_train_neg]
    train_labels[i] = 1
    train_labels[12500 + i] = 0
print(train_arrays)
print(train_labels)
test_arrays = numpy.zeros((25000, 100))
test_labels = numpy.zeros(25000)
for i in range(12500):
    prefix_test_pos = 'TEST_POS_' + str(i)
    prefix_test_neg = 'TEST_NEG_' + str(i)
    test_arrays[i] = model.docvecs[prefix_test_pos]
    test_arrays[12500 + i] = model.docvecs[prefix_test_neg]
    test_labels[i] = 1
    test_labels[12500 + i] = 0
# classifier
from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression()
classifier.fit(train_arrays, train_labels)
I got:
Out[24]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
verbose=0, warm_start=False)
classifier.score(test_arrays, test_labels)
Out[25]: 0.87831999999999999
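A single accuracy number hides where the classifier errs. For per-class precision and recall, a minimal sketch using scikit-learn's classification_report (the target names are mine, matching the 0/1 labels above):
# per-class precision/recall on the test set
from sklearn.metrics import classification_report
pred = classifier.predict(test_arrays)
print(classification_report(test_labels, pred, target_names=['neg', 'pos']))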
This is just great!
Wednesday, December 23, 2015
Gensim: "Modern Methods for Sentiment Analysis by Michael Czerny"
This is purely based on https://districtdatalabs.silvrback.com/modern-methods-for-sentiment-analysis#disqus_thread and the comments on the page. A few small changes were needed so I captured the updates here.
1. Preparation
Download 7z if you don't have it yet from http://www.7-zip.org/download.html .
Download GoogleNews-vectors-negative300.bin.gz from https://code.google.com/p/word2vec/ .
Download IMDB review data from http://bit.ly/1FizNyc .
As suggested in the original webpage, go to http://www.enchantedlearning.com/wordlist/ and collect words for food, sports, and weather, and put the words in food_words.txt, sports_words.txt, and weather_words.txt.
In Ubuntu, you may need to set up the C compiler:
sudo apt-get install build-essential
2. Test 1
In Spyder IPython window, paste the following
from gensim.models.word2vec import Word2Vec
model = Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
model.most_similar(positive=['woman', 'king'], negative=['man'])
and you should get
[('queen', 0.7118191719055176),
 ('monarch', 0.6189674139022827),
 ('princess', 0.5902431011199951),
 ('crown_prince', 0.5499460697174072),
 ('prince', 0.5377321243286133)]
You will need at least about 8GB of memory. I also tried with 4GB of RAM; it gave the result after more than an hour, which is too slow.
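If memory is tight, newer gensim releases accept a limit argument to load_word2vec_format that reads only the first N vectors from the file. A minimal sketch, assuming your gensim version already supports it (the 2015-era version may not):
# load only the 500,000 most frequent vectors to reduce memory use
model = Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin',
                                      binary=True, limit=500000)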
3. Test 2 (a continuation of Test 1)
import numpy as np

with open('food_words.txt', 'r') as infile:
    food_words = infile.readlines()
with open('sports_words.txt', 'r') as infile:
    sports_words = infile.readlines()
with open('weather_words.txt', 'r') as infile:
    weather_words = infile.readlines()

def getWordVecs(words):
    vecs = []
    for word in words:
        word = word.replace('\n', '')
        try:
            vecs.append(model[word].reshape((1, 300)))
        except KeyError:
            continue
    vecs = np.concatenate(vecs)
    return np.array(vecs, dtype='float')  # TSNE expects float type values

food_vecs = getWordVecs(food_words)
sports_vecs = getWordVecs(sports_words)
weather_vecs = getWordVecs(weather_words)
If you run into an error reading the text files (which I encountered on some systems but not always), change the opens to specify the encoding:
with open('food_words.txt', 'r', encoding='utf8') as infile:
    food_words = infile.readlines()
with open('sports_words.txt', 'r', encoding='utf8') as infile:
    sports_words = infile.readlines()
with open('weather_words.txt', 'r', encoding='utf8') as infile:
    weather_words = infile.readlines()
Then
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

ts = TSNE(2)
reduced_vecs = ts.fit_transform(np.concatenate((food_vecs, sports_vecs, weather_vecs)))

# color points by word group to see if Word2Vec can separate them
for i in range(len(reduced_vecs)):
    if i < len(food_vecs):
        # food words colored blue
        color = 'b'
    elif i >= len(food_vecs) and i < (len(food_vecs) + len(sports_vecs)):
        # sports words colored red
        color = 'r'
    else:
        # weather words colored green
        color = 'g'
    plt.plot(reduced_vecs[i, 0], reduced_vecs[i, 1], marker='o', color=color, markersize=8)
Then you should see a plot of 3 clustered colored dots.
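If the figure does not appear automatically outside IPython's inline mode, ending the snippet with an explicit show call forces it to render (the title text here is my own addition):
plt.title('t-SNE projection of food / sports / weather words')
plt.show()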
4. Test 3
This is modified from the original Twitter-data-based test. However, since we don't have Twitter data, we substitute the pos.txt and neg.txt from the IMDB review data, so this is just for the sake of testing the code.
from sklearn.cross_validation import train_test_split
from gensim.models.word2vec import Word2Vec

with open('pos.txt', 'r', encoding='utf8') as infile:
    pos_tweets = infile.readlines()
with open('neg.txt', 'r', encoding='utf8') as infile:
    neg_tweets = infile.readlines()

# use 1 for positive sentiment, 0 for negative
y = np.concatenate((np.ones(len(pos_tweets)), np.zeros(len(neg_tweets))))

x_train, x_test, y_train, y_test = train_test_split(np.concatenate((pos_tweets, neg_tweets)), y, test_size=0.2)

# Do some very minor text preprocessing
def cleanText(corpus):
    corpus = [z.lower().replace('\n', '').split() for z in corpus]
    return corpus

x_train = cleanText(x_train)
x_test = cleanText(x_test)

n_dim = 300
# Initialize model and build vocab
imdb_w2v = Word2Vec(size=n_dim, min_count=10)
imdb_w2v.build_vocab(x_train)

# Train the model over train_reviews (this may take several minutes)
imdb_w2v.train(x_train)
I got an output of 8684307.
# Build word vector for training set by using the average value of
# all word vectors in the tweet, then scale
def buildWordVector(text, size):
    vec = np.zeros(size).reshape((1, size))
    count = 0.
    for word in text:
        try:
            vec += imdb_w2v[word].reshape((1, size))
            count += 1.
        except KeyError:
            continue
    if count != 0:
        vec /= count
    return vec
from sklearn.preprocessing import scale

train_vecs = np.concatenate([buildWordVector(z, n_dim) for z in x_train])
train_vecs = scale(train_vecs)

# Train word2vec on test tweets
imdb_w2v.train(x_test)
I got:
WARNING:gensim.models.word2vec:supplied example count (10000) did not equal expected count (40000)
Out[11]: 2172554
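The warning appears because the model remembers the corpus size from build_vocab (the 40000 training reviews) while we are now feeding it the 10000 test reviews. Assuming your gensim version's train() accepts the total_examples keyword (the versions that print this warning should), passing the actual size silences the mismatch; a sketch:
# tell train() how many examples to expect so the counts match
imdb_w2v.train(x_test, total_examples=len(x_test))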
# Build test tweet vectors then scale
test_vecs = np.concatenate([buildWordVector(z, n_dim) for z in x_test])
test_vecs = scale(test_vecs)
# Use a classification algorithm (i.e., stochastic logistic regression) on the
# training set, then assess model performance on the test set
from sklearn.linear_model import SGDClassifier

lr = SGDClassifier(loss='log', penalty='l1')
lr.fit(train_vecs, y_train)

print('Test Accuracy: %.2f' % lr.score(test_vecs, y_test))
I got
Test Accuracy: 0.72
Note that for the last statement to run correctly under Python 3, I needed to add parentheses to the print call.
#Create ROC curve
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
pred_probas = lr.predict_proba(test_vecs)[:,1]
fpr,tpr,_ = roc_curve(y_test, pred_probas)
roc_auc = auc(fpr,tpr)
plt.plot(fpr,tpr,label='area = %.2f' %roc_auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.legend(loc='lower right')
plt.show()
5. Test 4
import gensim
LabeledSentence = gensim.models.doc2vec.LabeledSentence
from sklearn.cross_validation import train_test_split
import numpy as np
with open('pos.txt', 'r') as infile:
    pos_reviews = infile.readlines()
with open('neg.txt', 'r') as infile:
    neg_reviews = infile.readlines()
with open('unsup.txt', 'r') as infile:
    unsup_reviews = infile.readlines()
#use 1 for positive sentiment, 0 for negative
y = np.concatenate((np.ones(len(pos_reviews)), np.zeros(len(neg_reviews))))
x_train, x_test, y_train, y_test = train_test_split(np.concatenate((pos_reviews, neg_reviews)), y, test_size=0.2)
#Do some very minor text preprocessing
def cleanText(corpus):
    punctuation = """.,?!:;(){}[]"""
    corpus = [z.lower().replace('\n', '') for z in corpus]
    corpus = [z.replace('<br />', ' ') for z in corpus]
    # treat punctuation as individual words
    for c in punctuation:
        corpus = [z.replace(c, ' %s ' % c) for z in corpus]
    corpus = [z.split() for z in corpus]
    return corpus
x_train = cleanText(x_train)
x_test = cleanText(x_test)
unsup_reviews = cleanText(unsup_reviews)
#Gensim's Doc2Vec implementation requires each document/paragraph to have a label associated with it.
#We do this by using the LabeledSentence method. The format will be "TRAIN_i" or "TEST_i" where "i" is
#a dummy index of the review.
def labelizeReviews(reviews, label_type):
    labelized = []
    for i, v in enumerate(reviews):
        label = '%s_%s' % (label_type, i)
        labelized.append(LabeledSentence(v, [label]))
    return labelized
x_train = labelizeReviews(x_train, 'TRAIN')
x_test = labelizeReviews(x_test, 'TEST')
unsup_reviews = labelizeReviews(unsup_reviews, 'UNSUP')
import random
size = 400
#instantiate our DM and DBOW models
model_dm = gensim.models.Doc2Vec(min_count=1, window=10, size=size, sample=1e-3, negative=5, workers=3)
model_dbow = gensim.models.Doc2Vec(min_count=1, window=10, size=size, sample=1e-3, negative=5, dm=0, workers=3)
#build vocab over all reviews
model_dm.build_vocab(np.concatenate((x_train, x_test, unsup_reviews)))
You may run into an error here: "Python int too large to convert to C long." If this occurs, change the hashfxn handling in the Word2Vec constructor __init__ from

self.cbow_mean = int(cbow_mean)
self.hashfxn = hashfxn
self.iter = iter

to

self.cbow_mean = int(cbow_mean)
# self.hashfxn = hashfxn
def hash32(value):
    return hash(value) & 0xffffffff
self.hashfxn = hash32
self.iter = iter

(See https://www.kaggle.com/c/word2vec-nlp-tutorial/forums/t/11197/gensim-word2vec-cython-on-windows/93787 .)
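Rather than editing the library source, it may be less invasive to pass the 32-bit hash in through the hashfxn parameter that the same constructor exposes; a minimal sketch of that approach (I did not test this path, so treat it as an assumption):
def hash32(value):
    # keep hashes within 32 bits so the C extension accepts them
    return hash(value) & 0xffffffff

model_dm = gensim.models.Doc2Vec(min_count=1, window=10, size=size, sample=1e-3,
                                 negative=5, workers=3, hashfxn=hash32)
model_dbow = gensim.models.Doc2Vec(min_count=1, window=10, size=size, sample=1e-3,
                                   negative=5, dm=0, workers=3, hashfxn=hash32)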
I also found that plain list concatenation works for the vocabulary build:

X = x_train + x_test + unsup_reviews
model_dm.build_vocab(X)
model_dbow.build_vocab(X)
On one system (Ubuntu 12.04, gensim 0.12.3), the following code

# Get training set vectors from our models
def getVecs(model, corpus, size):
    vecs = [np.array(model[z.labels[0]]).reshape((1, size)) for z in corpus]
    return np.concatenate(vecs)

train_vecs_dm = getVecs(model_dm, x_train, size)

failed with this error: AttributeError: 'LabeledSentence' object has no attribute 'labels'
On the other system, where it worked, x_train[0].labels is ['TRAIN_0'], which is why it worked there. On this system, however, the object has tags=['TRAIN_0'] instead. So I changed the code to:
# Get training set vectors from our models
def getVecs(model, corpus, size):
    vecs = [np.array(model[z.tags[0]]).reshape((1, size)) for z in corpus]
    return np.concatenate(vecs)

train_vecs_dm = getVecs(model_dm, x_train, size)
However, this generated another error:
model_dm[x_train[0]]
File "/home/anaconda3/lib/python3.5/site-packages/gensim-0.12.3-py3.5-linux-x86_64.egg/gensim/models/word2vec.py", line 1293, in <listcomp>
return vstack([self.syn0[self.vocab[word].index] for word in words])
Reverting to gensim 0.10.3 seems to resolve this problem temporarily.
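To avoid chasing this attribute rename across gensim versions, a small compatibility helper can fetch whichever attribute exists; a minimal sketch (the helper name first_tag is mine):
def first_tag(sent):
    # LabeledSentence exposed .labels in older gensim and .tags in newer releases
    return getattr(sent, 'tags', getattr(sent, 'labels', None))[0]

def getVecs(model, corpus, size):
    vecs = [np.array(model[first_tag(z)]).reshape((1, size)) for z in corpus]
    return np.concatenate(vecs)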
# We pass through the data set multiple times, shuffling the training reviews each time to improve accuracy.
all_train_reviews = np.concatenate((x_train, unsup_reviews))

# if this is too slow, you may need to change range(10) to range(1) or even range(0), but accuracy would be reduced
for epoch in range(10):
    perm = np.random.permutation(all_train_reviews.shape[0])
    model_dm.train(all_train_reviews[perm])
    model_dbow.train(all_train_reviews[perm])
# Get training set vectors from our models
def getVecs(model, corpus, size):
    vecs = [np.array(model[z.labels[0]]).reshape((1, size)) for z in corpus]
    return np.concatenate(vecs)

train_vecs_dm = getVecs(model_dm, x_train, size)
train_vecs_dbow = getVecs(model_dbow, x_train, size)

train_vecs = np.hstack((train_vecs_dm, train_vecs_dbow))
# train over test set
x_test = np.array(x_test)

for epoch in range(10):
    perm = np.random.permutation(x_test.shape[0])
    model_dm.train(x_test[perm])
    model_dbow.train(x_test[perm])
#Construct vectors for test reviews
test_vecs_dm = getVecs(model_dm, x_test, size)
test_vecs_dbow = getVecs(model_dbow, x_test, size)
test_vecs = np.hstack((test_vecs_dm, test_vecs_dbow))
from sklearn.linear_model import SGDClassifier
lr = SGDClassifier(loss='log', penalty='l1')
lr.fit(train_vecs, y_train)
print('Test Accuracy: %.2f' % lr.score(test_vecs, y_test))
#Create ROC curve
from sklearn.metrics import roc_curve, auc
%matplotlib inline
import matplotlib.pyplot as plt
pred_probas = lr.predict_proba(test_vecs)[:,1]
fpr,tpr,_ = roc_curve(y_test, pred_probas)
roc_auc = auc(fpr,tpr)
plt.plot(fpr,tpr,label='area = %.2f' %roc_auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.legend(loc='lower right')
plt.show()
Sunday, December 20, 2015
Easy installation of Gensim/word2vec in Python
1. Install Anaconda
Go to https://www.continuum.io/downloads, download the installer, and install. I tried both the Windows 64-bit version and the Linux 64-bit version.
Note that easy_install from https://pypi.python.org/pypi/setuptools is already included.
2. Install gensim
This is mainly based on https://radimrehurek.com/gensim/install.html but simplified.
To install gensim, type
easy_install --upgrade gensim
in Anaconda Prompt in Windows, or in a terminal in Ubuntu.
Another way to install gensim easily is to type the following in the Anaconda Prompt:
conda install gensim
I tried pip and other methods for gensim, but ran into problems (see below). So the above way is recommended.
To check the packages, type "conda list" and make sure gensim is included.
Other ways to install Python and gensim may be more complicated. One reason may be that a C compiler or BLAS/LAPACK is needed.
3. Open Spyder to test.
Type "from gensim.models.word2vec import Word2Vec" in the IPython Console in the lower left corner. If no error is generated, you are ready for gensim and word2vec.
If an older gensim version is needed (e.g., due to the recent change in gensim from LabeledSentence to TaggedDocument), you may want to revert to an old version:
pip uninstall gensim
pip install gensim==0.10.3
or, after downloading the package first:
pip install gensim-0.10.3.tar.gz
conda install gensim-0.10.3.tar.gz
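After reverting, you can confirm which version is actually active. A minimal check (gensim exposes __version__ in the releases I have seen; treat that as an assumption for very old ones):
import gensim
print(gensim.__version__)  # expect 0.10.3 after the downgrade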
4. Gensim fast version
In Spyder, you may check whether you have the fast version of gensim. The fast version can give about a 70x speedup, but a C compiler is needed.
Type
import gensim
gensim.models.word2vec.FAST_VERSION
If you get 1, then you have it. Otherwise, install MinGW or MSVC (select Visual C++ after installing the Visual Studio 2015 Community version) in Windows, or gcc (e.g., via build-essential) in Ubuntu. MinGW's path needs to be added to the system path or user path; likewise for MSVC. Then do "conda uninstall gensim" or "pip uninstall gensim", and then "conda install gensim" or "pip install gensim". After these steps, try
import gensim
gensim.models.word2vec.FAST_VERSION
and see if you get 1. If not, I found it may be useful to add the following:
[blas]
library_dirs = C:\BLAS
blas_libs = libblas
[lapack]
library_dirs = C:\BLAS
lapack_libs = liblapack
OR
[blas]
library_dirs = C:\BLAS
blas_libs = libblas3
[lapack]
library_dirs = C:\BLAS
lapack_libs = liblapack3
depending on which blas/lapack files you have, into
C:\Users\A\Anaconda3\Lib\site-packages\numpy\distutils\site.cfg
Then try again, and you should get 1.
Monday, December 14, 2015
DL4J Deep Learning for Java installation on Windows
This is purely based on http://deeplearning4j.org/ but selects the steps I followed to install DL4J on Windows. The website deeplearning4j.org provides multiple ways to do this, but some can be complicated. This is what I have done, and it works for me. I hope this is helpful for others who want to do similar things. Please forgive my formatting ...
Java 7 or above
Java is the main interface and networking language of ND4J, because it's used for everything from distributed cloud-based systems with thousands of nodes to low-memory IoT devices. It's a "write once, run anywhere" language. If you don't have Java 7 installed on your machine, download the Java Development Kit (JDK) here.
What I downloaded is JDK1.8.0_66. And I added JAVA_HOME as an environment variable. See following screenshots:
To test which version of Java you have (and whether you have it at all), type java -version into your command line:
d:>java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b18)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b18, mixed mode)
Maven
Maven is an automated build tool for Java projects (among its other uses). It locates the latest version of the ND4J and DL4J project libraries (.jar files) and downloads them automatically. You can find those repositories on Maven Central. After downloading, extract it and set the PATH variable.
Similar to the setting of JAVA_HOME when installing Java, here we add M2_HOME with the value C:\apache-maven-3.3.9 and M2 with the value %M2_HOME%\bin in the user variables.
To check whether Maven is installed on your machine, and which version you have, enter the following into the command line:
C:\>mvn --version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T10:41:47-06:00)
Maven home: C:\apache-maven-3.3.9\bin\..
Java version: 1.8.0_66, vendor: Oracle Corporation
Java home: C:\Program Files\Java\jdk1.8.0_66\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 7", version: "6.1", arch: "amd64", family: "dos"
Integrated Development Environment: IntelliJ
An Integrated Development Environment (IDE) will allow you to work with the API and build your nets with a few clicks. The free community edition of IntelliJ has installation instructions. This is straightforward.
DL4J example
The example can be downloaded and extracted from
https://github.com/deeplearning4j/dl4j-0.4-examples
Git can be used if Git is installed.
Now open IntelliJ, choose "Import Project", navigate to D:\DL4J\dl4j-0.4-examples-master\ and select the file pom.xml. Select DBNIrisExample.java from the lefthand file tree. Hit run! (It’s the green button that appears when you right-click on the source file…)
The following shows my setting and output in IntelliJ:
To run more examples and do more experiments, it seems more tools are needed.
MinGW
Install MinGW 32 bits even if you have a 64-bit computer (the download button is on the upper right). I also installed a few other packages using the installer.
Download Lapack liblapack3.dll without the Intel compiler, and also download libblas3.dll. Put the files under C:\BLAS and add this directory to PATH.
It seems these are sufficient for running the examples. Git can also be installed.
Checking the tools
Running this file, WindowsInfo.bat, can help debug your Windows install. Here's one example of its output that shows what to expect. First download it, then open a command window / terminal and cd to the directory to which it was downloaded. Enter WindowsInfo and hit enter. To copy its output, right-click on the command window -> Select All -> hit enter. The output is then on the clipboard.
Here is what I got:
d:WindowsInfo.bat
Getting data. Please Wait...
--------------------------------------------
Operating System: Microsoft Windows 7 Professional 64-bit
Service Pack: 1
Processor: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
GPU: Intel(R) HD Graphics 5500
Total Memory: 8080
--------------------------------------------
-- Java --
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b18)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b18, mixed mode)
Java home: C:\Program Files\Java\jdk1.8.0_66
--------------------------------------------
-- cl.exe --
INFO: Could not find files for the given pattern(s).
'cl.exe' is not recognized as an internal or external command,
operable program or batch file.
-- vcvars32.bat --
INFO: Could not find files for the given pattern(s).
-- vcvars64.bat --
INFO: Could not find files for the given pattern(s).
-- CUDA --
'nvcc' is not recognized as an internal or external command,
operable program or batch file.
-- GIT --
git version 2.6.3.windows.1
-- Maven --
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T10:41:47-06:00)
Maven home: C:\apache-maven-3.3.9\bin\..
Java version: 1.8.0_66, vendor: Oracle Corporation
Java home: C:\Program Files\Java\jdk1.8.0_66\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 7", version: "6.1", arch: "amd64", family: "dos"
-- OpenBLAS --
libblas3.dll, liblapack3.dll, libopenblas.dll:
C:\blas\libblas3.dll
C:\blas\liblapack3.dll
C:\blas\libopenblas.dll
--------------------------------------------
-- PATH --
C:\ProgramData\Oracle\Java\javapath;C:\windows\system32;C:\windows;C:\windows\System32\Wbem;C:\windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\WiFi\bin\;C:\Program Files\Common Files\Intel\WirelessCommon\;C:\Program Files (x86)\Riverbed\Steelhead Mobile\;C:\Program Files (x86)\Skype\Phone\;C:\Program Files (x86)\IVI Foundation\VISA\WinNT\Bin;C:\Program Files\Git\cmd;D:\Tools\miktex\bin\x64\;C:\apache-maven-3.3.9\bin;C:\blas;C:\Program Files\Git\bin