# Bert Base Multilingual Cased Finetuned Yoruba IR
## SentenceTransformer based on Davlan/bert-base-multilingual-cased-finetuned-yoruba

This is a [sentence-transformers](https://www.sbert.net) model finetuned from [Davlan/bert-base-multilingual-cased-finetuned-yoruba](https://huggingface.co/Davlan/bert-base-multilingual-cased-finetuned-yoruba). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [Davlan/bert-base-multilingual-cased-finetuned-yoruba](https://huggingface.co/Davlan/bert-base-multilingual-cased-finetuned-yoruba)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

## Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
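## Usage

A minimal usage sketch follows. The Hub model id `adejumobi/bert-base-multilingual-cased-finetuned-yoruba-IR` is an assumption inferred from this page's header; substitute the actual repository id if it differs.

```python
from sentence_transformers import SentenceTransformer

# Assumed Hub id (inferred from the page header); replace if the repo id differs.
model = SentenceTransformer("adejumobi/bert-base-multilingual-cased-finetuned-yoruba-IR")

# Example Yoruba sentences taken from the training samples below.
sentences = [
    "Kini OnePlus Ọkan?",               # a query
    "Bawo ni OnePlus kan?",             # its paraphrase
    "Kini idi ti Graffiti jẹ arufin?",  # an unrelated question
]

# Encode into 768-dimensional dense vectors.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Pairwise cosine similarities (Sentence Transformers >= 3.0).
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```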
## Evaluation

Triplet evaluation metrics:

| Metric             | Value |
|:-------------------|:------|
| cosine_accuracy    | 0.865 |
| dot_accuracy       | 0.135 |
| manhattan_accuracy | 0.868 |
| euclidean_accuracy | 0.868 |
| max_accuracy       | 0.868 |

## Training Dataset

- Size: 5,019 training samples
- Columns: `query`, `pos`, and `neg`
- Loss: `TripletLoss`

Approximate statistics based on the first 1000 samples:

|         | query | pos | neg |
|:--------|:------|:----|:----|
| type    | string | string | string |
| details | min: 5 tokens, mean: 24.62 tokens, max: 74 tokens | min: 6 tokens, mean: 24.14 tokens, max: 79 tokens | min: 4 tokens, mean: 25.71 tokens, max: 98 tokens |

Samples:

| query | pos | neg |
|:------|:----|:----|
| Kini idi ti Ilu India ṣe a ko ni ọkan lori ijiroro oloselu kan bi ni AMẸRIKA? | Kini idi ti a ko le ni ijiroro gbangba laarin awọn oloselu ni India bi ọkan ninu wa? | Njẹ eniyan le da quo duro de India Pakistan ariyanjiyan?A ni aisan ati ti o ri eyi lojoojumọ ni olopo? |
| Kini OnePlus Ọkan? | Bawo ni OnePlus kan? | Kini idi ti OnePlus Ọkan dara? |
| Ṣe ọkan wa ṣe iṣakoso awọn ẹdun wa? | Bawo ni ọlọgbọn ati awọn eniyan aṣeyọri ṣe ṣakoso awọn ẹdun wọn? | Bawo ni MO ṣe le ṣakoso awọn ẹdun mi rere fun awọn eniyan ti Mo nifẹ ṣugbọn wọn ko bikita nipa mi? |

## Evaluation Dataset

- Size: 1,000 evaluation samples
- Columns: `query`, `pos`, and `neg`
- Loss: `TripletLoss`

Approximate statistics based on the first 1000 samples:

|         | query | pos | neg |
|:--------|:------|:----|:----|
| type    | string | string | string |
| details | min: 6 tokens, mean: 24.32 tokens, max: 94 tokens | min: 6 tokens, mean: 24.06 tokens, max: 115 tokens | min: 6 tokens, mean: 25.58 tokens, max: 121 tokens |

Samples:

| query | pos | neg |
|:------|:----|:----|
| Bawo ni o jẹ ọjọ ebi? | Bawo ni o jẹ ọsan | Njẹ NEBM lueMo ṣẹlẹ lati wa awọn ifiweranṣẹ ti o sọ pe o jẹ iro ati pe ko ni itter |
| Kini awọn ohun elo akọkọ ti kọnputa kan? | Kini diẹ ninu awọn ẹya akọkọ ti kọnputa kan?Awọn iṣẹ wo ni wọn nṣe iranṣẹ? | Kini awọn eto kọmputa?Kini awọn iṣẹ ti awọn eto kọnputa? |
| Ṣe o le faffiti Artists fun sokiri Graffiti ni Rockdale County, GA? | Ṣe o le fun awọn ojukokoro fun fun sokiri Graffiti ni Cockdale County, Georgia? | Kini idi ti Graffiti jẹ arufin? |

## Training Hyperparameters

### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 12
- `per_device_eval_batch_size`: 3
- `learning_rate`: 1e-05
- `num_train_epochs`: 5
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 12
- `per_device_eval_batch_size`: 3
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
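A sketch of how a run with these non-default hyperparameters could be set up in Sentence Transformers 3.x is shown below. The actual triplet splits are not bundled with this card, so the example uses hypothetical placeholder rows, and the `TripletLoss` parameters (not preserved above) are left at their defaults.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("Davlan/bert-base-multilingual-cased-finetuned-yoruba")

# Hypothetical placeholder triplets; the real 5,019/1,000-sample splits are not included here.
train_dataset = Dataset.from_dict({
    "query": ["Kini OnePlus Ọkan?"],
    "pos":   ["Bawo ni OnePlus kan?"],
    "neg":   ["Kini idi ti OnePlus Ọkan dara?"],
})
eval_dataset = train_dataset

loss = TripletLoss(model)  # the card's TripletLoss parameters were not preserved; defaults used

# Mirrors the non-default hyperparameters listed above (fp16 requires a GPU).
args = SentenceTransformerTrainingArguments(
    output_dir="outputs",
    num_train_epochs=5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=3,
    learning_rate=1e-5,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```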
## Training Logs

| Epoch  | Step | Training Loss | Validation Loss | cosine_accuracy |
|:------:|:----:|:-------------:|:---------------:|:---------------:|
| 0      | 0    | -             | -               | 0.827           |
| 0.2387 | 100  | 4.247         | 3.6056          | 0.815           |
| 0.4773 | 200  | 3.3576        | 2.7548          | 0.809           |
| 0.7160 | 300  | 2.931         | 2.3805          | 0.843           |
| 0.9547 | 400  | 2.4476        | 2.1895          | 0.858           |
| 1.1933 | 500  | 2.5839        | 2.1148          | 0.854           |
| 1.4320 | 600  | 2.0645        | 2.0497          | 0.855           |
| 1.6706 | 700  | 1.8386        | 2.0328          | 0.847           |
| 1.9093 | 800  | 1.5527        | 1.9380          | 0.857           |
| 2.1480 | 900  | 1.7298        | 1.8999          | 0.861           |
| 2.3866 | 1000 | 1.4375        | 1.8744          | 0.855           |
| 2.6253 | 1100 | 1.1605        | 1.8761          | 0.861           |
| 2.8640 | 1200 | 1.0601        | 1.8658          | 0.862           |
| 3.1026 | 1300 | 1.1019        | 1.8181          | 0.861           |
| 3.3413 | 1400 | 1.052         | 1.8088          | 0.854           |
| 3.5800 | 1500 | 0.8807        | 1.7937          | 0.862           |
| 3.8186 | 1600 | 0.7877        | 1.7963          | 0.862           |
| 4.0573 | 1700 | 0.7613        | 1.7869          | 0.868           |
| 4.2959 | 1800 | 0.8018        | 1.7696          | 0.867           |
| 4.5346 | 1900 | 0.6717        | 1.7815          | 0.865           |
| 4.7733 | 2000 | 0.6603        | 1.7776          | 0.865           |

## Framework Versions

- Python: 3.10.13
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2
- Accelerate: 0.31.0
- Datasets: 2.19.2
- Tokenizers: 0.19.1
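The accuracy columns above are the standard triplet metrics. A sketch of recomputing them with `TripletEvaluator`, again using hypothetical placeholder triplets and the assumed Hub id from the usage example:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

# Assumed Hub id; see the usage sketch above.
model = SentenceTransformer("adejumobi/bert-base-multilingual-cased-finetuned-yoruba-IR")

# Hypothetical placeholder triplets standing in for the 1,000-sample evaluation split.
anchors   = ["Kini OnePlus Ọkan?"]
positives = ["Bawo ni OnePlus kan?"]
negatives = ["Kini idi ti OnePlus Ọkan dara?"]

evaluator = TripletEvaluator(anchors=anchors, positives=positives, negatives=negatives)
results = evaluator(model)
print(results)  # cosine/dot/manhattan/euclidean/max accuracies, as in the metric table
```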