python 3.x - Find probability of a string in multiple text files


I have 2 text files of 500 words each, and I have a string, say x = 'this sample test'.
I need to find the probability of each word of the string in both files, then compare them with each other to find which file has the maximum probability of containing the string.

```python
import math
from collections import Counter, defaultdict

import numpy as np
from nltk.tokenize import word_tokenize as tokenize

# open files
with open('bonjovi.txt') as f:
    bonjovi_text = f.read()
with open('edward.txt') as f:
    edward_text = f.read()

# tokenize files
word_tokenizer_bonjovi = tokenize(bonjovi_text.lower())
word_tokenizer_edward = tokenize(edward_text.lower())

# define dictionary of word counts, one Counter per file
classifier = defaultdict(Counter)
classifier['bonjovi'] = Counter(word_tokenizer_bonjovi)
classifier['edward'] = Counter(word_tokenizer_edward)

enter = input('enter query:')
rlist = enter.lower().split()
finallist = np.zeros(2)  # summed log-probabilities: index 0 = bonjovi, index 1 = edward
print(rlist)

# search file 1
for word in rlist:
    filter_query = classifier['bonjovi'][word]
    # add-one smoothed probability of the word in this file
    wordprob_bonjovi = (1 + filter_query) / (sum(classifier['bonjovi'].values()) + len(classifier['bonjovi']))
    classprob = 1 / len(classifier)  # equal prior for each file
    bon_probability = math.log(classprob * wordprob_bonjovi)
    finallist[0] += bon_probability

# search file 2
for word in rlist:
    search_query = classifier['edward'][word]
    wordprob_edward = (1 + search_query) / (sum(classifier['edward'].values()) + len(classifier['edward']))
    classprob2 = 1 / len(classifier)
    # probability of the word multiplied by the class probability
    edward_probability = math.log(classprob2 * wordprob_edward)
    finallist[1] += edward_probability

print(finallist)
# compare only after both files are scored: 0 = bonjovi, 1 = edward
print(np.argmax(finallist))
```
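For comparison, here is a minimal sketch of the same idea written as one function that scores any number of texts. It is only an illustration under stated assumptions, not the code from the question: `score_texts` is a hypothetical helper, tokenization is a plain lowercase whitespace split instead of NLTK, the add-one smoothing uses a vocabulary shared by all texts, and the equal class prior is added once per text rather than once per word. The two short strings are excerpts from the file contents shown in this post and stand in for the real files.

```python
import math
from collections import Counter


def score_texts(query, texts):
    """Return {name: log-score} of `query` for each text in `texts`.

    Hypothetical helper for illustration. Assumes whitespace tokenization
    and an equal prior for every text.
    """
    counts = {name: Counter(text.lower().split()) for name, text in texts.items()}
    # Shared vocabulary, so every text is smoothed with the same vocabulary size.
    vocab = set().union(*counts.values())
    log_prior = math.log(1 / len(texts))

    scores = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        score = log_prior  # prior is added once per text
        for word in query.lower().split():
            # add-one smoothed probability of the word in this text
            score += math.log((counter[word] + 1) / (total + len(vocab)))
        scores[name] = score
    return scores


# Short excerpts used as stand-ins for bonjovi.txt and edward.txt.
texts = {
    "bonjovi": "it's life it's or never ain't gonna live forever",
    "edward": "this, life i'm looking searching love in eyes",
}
scores = score_texts("live forever", texts)
best = max(scores, key=scores.get)  # name of the text with the highest log-score
print(scores)
print(best)
```

Working in log space keeps the product of many small word probabilities from underflowing, and the text with the largest (least negative) sum is the most likely source of the query.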

The text of the 2 files is below:

bonjovi: ain't song broken-hearted no silent prayer faith-departed ain't gonna face in crowd you're gonna hear voice when shout out loud

it's life it's or never ain't gonna live forever want live while i'm alive (it's life) heart open highway frankie said did way want live while i'm alive it's life

this ones stood ground it's tommy , gina never backed down tomorrow's getting harder, make no mistake luck ain't enough you've got make own breaks

it's life , it's or never ain't gonna live forever want live while i'm alive (it's life) heart open highway frankie said did way want live while i'm alive it's life

you better stand tall when they're calling out don't bend, don't break, baby, don't down

it's life , it's or never ain't gonna live forever want live while i'm alive (it's life) heart open highway frankie said did way want live while i'm alive (it's life)

edward: i'm dreaming, i'm dreaming out loud i'm searching missing part of heart uuuuu uuuu uuuu, catch me every time fall when eyes know tell me lies

this, life i'm looking searching love in eyes this, life i'm chasing dream fade away in night

can soul, i'll make lose control, i'll sun in night

just come here inside i'm playing i'm falling when eyes know tell me lies

ref: this, life i'm looking searching love in eyes this, life i'm chasing dream fade away in night

i can't follow dreams forever see them fall apart can change world if 'cause know won't let go when eyes know tell me lies

this, life i'm looking searching love in eyes this, life i'm chasing dream fade away in night start miss smile, voice hear i'm chasing hollow eyes, show me i'm wrong tonight this, life i'm looking searching love in eyes this, life i'm chasing dream fade away in night

