prithivida/bert-for-patents-64d

Published by 微草AIGC录 1 year ago (2024)

Motivation

This model is based on anferico/bert-for-patents, a BERT_LARGE model (see the next section for details). By default, a pre-trained model's output embeddings have size 768 (base models) or 1024 (large models). However, when you store millions of embeddings, this can require quite a lot of memory and storage. So the embedding dimension has been reduced to 64, i.e. 1/16th of 1024, using Principal Component Analysis (PCA), and it still gives comparable performance. Yes! PCA gives better performance than NMF. Note: This process improves neither the runtime nor the memory requirement for running the model. It only reduces the space needed to store embeddings, for example for semantic search using vector databases.
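To make the storage argument concrete, here is a minimal sketch of reducing 1024-dimensional vectors to 64 dimensions with PCA. It is not the author's training code: the random matrix stands in for real model embeddings, the helper names `fit_pca` and `reduce_embeddings` are hypothetical, and the PCA is computed with a plain NumPy SVD and float32 storage as assumptions.

```python
import numpy as np


def fit_pca(embeddings: np.ndarray, n_components: int):
    """Fit a PCA projection; returns the mean and the top principal directions."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def reduce_embeddings(embeddings: np.ndarray, mean: np.ndarray,
                      components: np.ndarray) -> np.ndarray:
    """Project embeddings onto the fitted principal directions."""
    return ((embeddings - mean) @ components.T).astype(np.float32)


# Synthetic stand-in for 1000 embeddings from a BERT-large style model (assumption).
rng = np.random.default_rng(0)
full = rng.standard_normal((1000, 1024)).astype(np.float32)

mean, components = fit_pca(full, n_components=64)
small = reduce_embeddings(full, mean, components)

# Compare the storage footprint of the original and reduced vectors.
print(full.shape, full.nbytes)
print(small.shape, small.nbytes)
```

Under the float32 assumption, one million 1024-dimensional vectors take about 4.1 GB (1,000,000 × 1024 × 4 bytes), while the 64-dimensional versions take about 256 MB. Running the model itself costs the same either way; only the stored vectors shrink.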

BERT for Patents

BERT for Patents is a model trained by Google on 100M+ patents (not just US patents).
If you want to learn more about the model, check out the blog post, white paper and GitHub page containing the original TensorFlow checkpoint.

Projects using this model (or variants of it):

Patents4IPPC (carried out by Pi School and commissioned by the Joint Research Centre (JRC) of the European Commission)
