NER model #6278
Unanswered
Bmikaella
asked this question in
Help: Other Questions
Replies: 0 comments
Hi!
I have a question regarding the architecture of the NER models. I have watched the video explanation (https://spacy.io/universe/project/video-spacys-ner-model) but I’m unsure if I understand all of it correctly.
I’m mostly concerned about the CNN part. Here is my understanding of it.
input_vector = [number_of_words x embedding_size]
cnn_layer_n = [3 x embedding_size]
For a sentence of 6 words:
input_vector = [v1,v2,v3,v4,v5,v6]
where each vn is of size [1 x embedding_size]
First, we pad the input_vector to preserve the dimensions. A window of size 3 is slid across the padded input with a step of one, and for every three words the maximum of each component across the three vectors is taken. This yields an output of the same size as the input, in which each position combines a word with its two surrounding words. In each CNN layer, this output is then added to that layer's input.
[v1,v2,v3,v4,v5,v6] -- pad --> [0,v1,v2,v3,v4,v5,v6,0]
Here 0 denotes a zero vector, not a scalar value.
[0,v1,v2,v3,v4,v5,v6,0] -- cnn --> [maxout(0,v1,v2), maxout(v1,v2,v3), maxout(v2,v3,v4), maxout(v3,v4,v5), maxout(v4,v5,v6), maxout(v5,v6,0)]
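To make the flow above concrete, here is a minimal NumPy sketch of my reading of it: zero-pad, slide a window of three rows, take the element-wise maximum, and add the result back to the input. The function name `window_max_layer` and the toy input are made up for illustration; this reflects only my understanding, not necessarily what spaCy actually does.

```python
import numpy as np

def window_max_layer(X, window=3):
    """Hypothetical layer: element-wise max over a sliding window, plus the input."""
    n_words, width = X.shape
    pad = window // 2
    # Add zero vectors (rows) at both ends so the output keeps n_words rows.
    padded = np.pad(X, ((pad, pad), (0, 0)))
    # windows[:, j] holds the `window` padded rows centred on word j.
    windows = np.stack([padded[i:i + n_words] for i in range(window)], axis=0)
    pooled = windows.max(axis=0)  # shape: (n_words, width)
    # Residual connection: the layer output is added to the layer input.
    return X + pooled

# Toy "sentence" of 6 words with embedding_size = 2.
X = np.arange(12, dtype=float).reshape(6, 2)
Y = window_max_layer(X)
print(Y.shape)  # (6, 2)
```

Note that in this reading there are no learned weights at all, which is exactly what makes me doubt it.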
My question is about the CNN layer. The video does not mention kernels or how many of them are used. For the dimensions to stay unchanged and for the flow above to work, are the kernels of size [3 x embedding_size]? I assume that the number of kernels would have to equal embedding_size, since [8 x embedding_size] goes into the CNN and the output should be [6 x embedding_size].
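For comparison, this is how I picture the variant with kernels, again as a NumPy sketch. The assumption here (mine, not confirmed by the video) is that each window of three vectors is concatenated into one row of size 3 * embedding_size, multiplied by a weight matrix, and that maxout then takes the maximum over a few "pieces" per output dimension. The names `concat_window`, `maxout_window_layer`, `W`, `b` and `pieces` are hypothetical.

```python
import numpy as np

def concat_window(X, window=3):
    """Concatenate each word vector with its neighbours (zero-padded at the ends)."""
    n_words, width = X.shape
    pad = window // 2
    padded = np.pad(X, ((pad, pad), (0, 0)))
    # Result shape: (n_words, window * width)
    return np.concatenate([padded[i:i + n_words] for i in range(window)], axis=1)

def maxout_window_layer(X, W, b, pieces):
    """Hypothetical layer: linear map over each window, max over pieces, plus input."""
    n_words, width = X.shape
    feats = concat_window(X)                   # (n_words, 3 * width)
    pre = feats @ W + b                        # (n_words, width * pieces)
    pre = pre.reshape(n_words, width, pieces)  # group the pieces per output dimension
    return X + pre.max(axis=-1)                # maxout, then residual connection

rng = np.random.default_rng(0)
n_words, width, pieces = 6, 4, 3
X = rng.normal(size=(n_words, width))
# Each column of W plays the role of one kernel of size [3 x embedding_size].
W = rng.normal(size=(3 * width, width * pieces))
b = np.zeros(width * pieces)
out = maxout_window_layer(X, W, b, pieces)
print(out.shape)  # (6, 4)
```

Under this assumption the number of kernels would be embedding_size * pieces rather than embedding_size, and the output would still be [6 x embedding_size].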
If there are no kernels, and we only slide a window of size [3 x embedding_size] over the input and take the maximum of each component, then the flow above also works. I’m only confused because a CNN is mentioned, but kernels and filters are not.
Thank you!