<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.7//EN" "https://dtd.nlm.nih.gov/ncbi/pubmed/in/PubMed.dtd">
<ArticleSet>
<Article>
<Journal>
				<PublisherName>Semnan University</PublisherName>
				<JournalTitle>International Journal of Nonlinear Analysis and Applications</JournalTitle>
				<Issn>2008-6822</Issn>
				<Volume>12</Volume>
				<Issue>Special Issue</Issue>
				<PubDate PubStatus="epublish">
					<Year>2021</Year>
					<Month>12</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>DNA barcoding using particle swarm optimization on Apache Spark SQL case study: DNA of COVID-19</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>1561</FirstPage>
			<LastPage>1572</LastPage>
			<ELocationID EIdType="pii">5812</ELocationID>
			
<ELocationID EIdType="doi">10.22075/ijnaa.2021.5812</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Lala Septem</FirstName>
					<LastName>Riza</LastName>
<Affiliation>Department of Computer Science Education, Universitas Pendidikan Indonesia, Indonesia</Affiliation>

</Author>
<Author>
					<FirstName>Muhammad Ilham</FirstName>
					<LastName>Nurfathiya</LastName>
<Affiliation>Department of Computer Science Education, Universitas Pendidikan Indonesia, Indonesia</Affiliation>

</Author>
<Author>
					<FirstName>Jajang</FirstName>
					<LastName>Kusnendar</LastName>
<Affiliation>Department of Computer Science Education, Universitas Pendidikan Indonesia, Indonesia</Affiliation>

</Author>
<Author>
					<FirstName>Khyrina Airin Fariza</FirstName>
					<LastName>Abu Samah</LastName>
<Affiliation>Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Melaka Kampus Jasin, Melaka, Malaysia</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2021</Year>
					<Month>08</Month>
					<Day>16</Day>
				</PubDate>
			</History>
		<Abstract>The objective of this research is to design and implement a computational model that determines DNA barcodes using the Particle Swarm Optimization (PSO) algorithm on two big data platforms, Apache Hadoop and Apache Spark. The model works in four steps: (i) loading DNA sequences into the Hadoop Distributed File System (HDFS) in Apache Hadoop, (ii) pre-processing the data, (iii) running PSO as a User Defined Function (UDF) in Apache Spark, and (iv) collecting the results and saving them to HDFS. The model was then evaluated in two simulation scenarios: the first uses 4 cores and several worker nodes, while the second uses a cluster of 2 worker nodes and several cores. In terms of computational time, the results show that the big data platform is significantly faster than the standalone implementation in both experimental scenarios. The study shows that the computational model built on the big data platform extends the features of previous research and runs faster.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Big Data</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Algorithm</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Particle swarm optimization</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Similarity check</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Motif discovery</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">DNA barcoding</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://ijnaa.semnan.ac.ir/article_5812_4bcaa895a499d080c10a0fb495eadacd.pdf</ArchiveCopySource>
</Article>
</ArticleSet>
