
Computerworld Pakistan


UrduHack – the First of Its Kind Open Source Python Library

  • March 21, 2019
  • Sub Editor

Many Natural Language Processing (NLP) modules have been developed for English and other major languages to extract useful insights from unstructured data. However, not much work has been done for our local languages in Pakistan.

Very little NLP work has been done for Urdu so far, and one of the major reasons is the lack of basic tools and a framework to process the language.

To change that, a Pakistani duo, Ikram Ali and Mujadad Rao, has developed an open source Python library for Urdu called UrduHack.

Ikram has over 7 years of experience in the software industry and is an avid Machine Learning practitioner with a bachelor's degree in Computer Science from Virtual University, Pakistan. He is currently working as Principal Software Engineer at Arbisoft, Lahore.

Mujadad has a bachelor's degree in Computer Science from the University of Central Punjab and is currently employed as a Machine Learning Developer at Arbisoft.

Speaking to the local news, Ikram Ali described how he one day decided to look into the work being done on Urdu in the field of Natural Language Processing. He found only a few organizations building applications for Urdu and was disappointed to see that the work was purely commercial. As a result, he set out to change that by building a full-fledged Urdu library.

Speaking about UrduHack, Mujadad said, “Our plan is to achieve the maximum possible heights with UrduHack. We want to make it a full-fledged Urdu NLP library which people can use to make thousands of interesting applications for desktop, mobile or web.”

So far, the duo has developed two core modules of the library, Normalization and Tokenization, which are essential for cleaning data and converting it from a cluttered form to a standard form. According to the duo, the library is still very much a work in progress, and they plan to use TensorFlow v2 in upcoming modules later this month.
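To give a sense of what normalization and tokenization involve for Urdu text, here is a minimal sketch using only the Python standard library. It is not UrduHack's code or API: the function names, the two-entry character mapping and the choice of sentence-ending marks are all assumptions made for illustration.

```python
import re
import unicodedata

# Illustrative (assumed) mapping: Arabic letters that are often typed in
# place of their Urdu look-alikes. A real library would cover many more.
ARABIC_TO_URDU = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
_TRANSLATION = str.maketrans(ARABIC_TO_URDU)


def normalize(text):
    """Bring text to one standard form (hypothetical helper)."""
    text = unicodedata.normalize("NFC", text)  # compose combining sequences
    text = text.translate(_TRANSLATION)        # swap look-alike letters
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def tokenize_sentences(text):
    """Split after the Urdu full stop (U+06D4) or question mark (U+061F)."""
    parts = re.split(r"(?<=[\u06D4\u061F])\s*", text)
    return [part for part in parts if part]


def tokenize_words(sentence):
    """Return runs of characters that are not whitespace or punctuation."""
    return re.findall(r"[^\s\u06D4\u061F\u060C]+", sentence)
```

Normalizing first means that two spellings that look identical on screen but use different code points end up as the same string before they are split into sentences and words.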

The journey has not been without its challenges. While developing the library, the duo faced a number of technical difficulties, such as the use of Unicode for the Urdu script.

However, they were able to overcome the challenge by contacting the Unicode Consortium and demanding a separate fixed Unicode block for Urdu.
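As background on why Unicode is a pain point for Urdu tooling: Urdu letters are encoded inside Unicode's shared Arabic block (U+0600 to U+06FF), next to Arabic and Persian letters that can look nearly the same. The short sketch below, using Python's standard `unicodedata` module, prints the code point and official name of a few characters. The helper names `describe` and `in_arabic_block` are hypothetical and are not taken from UrduHack.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


def in_arabic_block(ch):
    """True if the character lies in the Arabic block, U+0600 to U+06FF."""
    return 0x0600 <= ord(ch) <= 0x06FF


# Two letters used in Urdu, followed by the Arabic yeh and the look-alike
# Farsi yeh that Urdu text is expected to use.
for row in describe("\u0679\u06D2\u064A\u06CC"):
    print(row, in_arabic_block(row[0]))
```

Because visually similar letters have different code points, text collected from different keyboards and websites can mix them, which is one reason a cleaning step is needed before any further processing.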

The second challenge is finding reliable and authentic data in Urdu. The UrduHack team is actively looking for Urdu data available in digital form, and anyone with access to such data can contact mujadad.ali@arbisoft.com.

 

Reference links: propakistani.pk


