Hack, Learn, Secure: Your Cybersecurity Playground

Example – Email Spam Detection

TASK 1 : Model Steps – Spam Classification using Naive Bayes Algorithm

 
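          # Install the libraries (run in a terminal)
          sudo apt install -y python3-pandas
          sudo apt install -y python3-numpy

          # Import them in Python under the aliases used in the steps below
          import numpy as np
          import pandas as pd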
          
        

          sudo apt install -y python3-sklearn

          from sklearn.feature_extraction.text import CountVectorizer
          from sklearn.model_selection import train_test_split
          from sklearn.naive_bayes import MultinomialNB
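
The classifier cannot work on raw text, so the imported CountVectorizer is what turns each message into a row of word counts. A minimal sketch of that step, assuming a two-message corpus invented purely for illustration (it is not taken from spam.csv):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical two-message corpus, invented for illustration only
messages = ["free offer activate now", "meeting moved to monday"]

vectorizer = CountVectorizer()
# Sparse matrix: one row per message, one column per distinct word
counts = vectorizer.fit_transform(messages)

# CountVectorizer assigns column indices in alphabetical order of the tokens,
# so sorting the learned vocabulary gives the column order
vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(counts.toarray())
```

Each column corresponds to one word of the learned vocabulary, and each cell holds how often that word occurs in the message.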
          
        

          data = pd.read_csv('https://raw.githubusercontent.com/AiDevNepal/ai-saturdays-workshop-8/master/data/spam.csv')
          data['target'] = np.where(data['target']=='spam', 1, 0)
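
The np.where call above replaces the text labels with numbers: 1 for 'spam' and 0 for everything else. A small sketch of the same conversion, assuming a hypothetical three-row table that stands in for spam.csv (the 'ham' label is an assumption used only for this example):

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for spam.csv; the 'ham' label is assumed
toy = pd.DataFrame({
    "text": ["win a free prize", "see you at lunch", "claim your cash now"],
    "target": ["spam", "ham", "spam"],
})

# 1 where the label is 'spam', 0 for any other label
toy["target"] = np.where(toy["target"] == "spam", 1, 0)
print(toy["target"].tolist())
```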
          
        

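          X_train, X_test, Y_train, Y_test = train_test_split(data['text'], data['target'], random_state=0)

          # Learn the vocabulary on the training text only, then convert both sets to word counts
          vectorizer = CountVectorizer()
          X_train_vectorized = vectorizer.fit_transform(X_train)
          X_test_vectorized = vectorizer.transform(X_test)

          model = MultinomialNB()
          model.fit(X_train_vectorized, Y_train)

          test_predictions = model.predict(X_test_vectorized)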
          
        

Testing

Email: “1-month unlimited calls offer Activate now”
Is Spam: 1

Email: “Dear ABC, Congratulations! You have been selected as a Software Developer at XYZ Company. We were really happy to see your enthusiasm for this vision and mission. We are impressed with your background and we think you would make an excellent addition to the team.”
Is Spam: 0
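
One way the "Is Spam" values above could be produced is to pass a new email through the fitted vectorizer and model. The sketch below is an assumption rather than the page's own code: it trains on six made-up messages instead of spam.csv, and the helper name is_spam is hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training messages (1 = spam, 0 = not spam), invented for illustration
train_texts = [
    "win a free prize now",
    "free unlimited calls offer activate now",
    "claim your free cash offer",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch with the team tomorrow",
]
train_labels = [1, 1, 1, 0, 0, 0]

vectorizer = CountVectorizer()
model = MultinomialNB()
model.fit(vectorizer.fit_transform(train_texts), train_labels)


def is_spam(email):
    """Return 1 if the fitted model labels the email as spam, else 0."""
    # transform (not fit_transform) reuses the vocabulary learned during training
    features = vectorizer.transform([email])
    return int(model.predict(features)[0])


print("Is Spam:", is_spam("1-month unlimited calls offer Activate now"))
print("Is Spam:", is_spam("please review the report before the meeting"))
```

Words the vectorizer never saw during training are ignored at prediction time, so the label depends only on the known words in the email.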

TASK 2 : Model Steps – Spam Classification using Logistic Regression


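          # Install the libraries (run in a terminal)
          sudo apt install -y python3-pandas
          sudo apt install -y python3-numpy

          # Import them in Python under the aliases used in the steps below
          import numpy as np
          import pandas as pd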
          
        

          sudo apt install -y python3-sklearn

          from sklearn.feature_extraction.text import CountVectorizer
          from sklearn.model_selection import train_test_split
          from sklearn.linear_model import LogisticRegression
          
        

          data = pd.read_csv('https://raw.githubusercontent.com/AiDevNepal/ai-saturdays-workshop-8/master/data/spam.csv')
          data['target'] = np.where(data['target']=='spam', 1, 0)
          
        

          X_train, X_test, Y_train, Y_test = train_test_split(data['text'], data['target'], random_state=0)
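
With no test_size given, train_test_split holds out 25% of the rows for testing, and a fixed random_state makes the shuffle repeatable. A short sketch of that behaviour, assuming eight placeholder messages invented for illustration:

```python
from sklearn.model_selection import train_test_split

# Hypothetical placeholder data, invented for illustration
texts = [f"message {i}" for i in range(8)]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X_train, X_test, Y_train, Y_test = train_test_split(texts, labels, random_state=0)
print(len(X_train), len(X_test))  # 6 training rows, 2 test rows

# The same random_state reproduces exactly the same split
again = train_test_split(texts, labels, random_state=0)
print(again[1] == X_test)
```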
          
        

          vectorizer = CountVectorizer()
          X_train_vectorized = vectorizer.fit_transform(X_train)
          model = LogisticRegression(max_iter=1000)
          model.fit(X_train_vectorized, Y_train)
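
The Logistic Regression steps can be tried end to end without downloading the dataset. The following is a minimal sketch under stated assumptions: the six training messages and the new email are made up for illustration, and the variable names new_email and spam_probability are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training messages (1 = spam, 0 = not spam), invented for illustration
train_texts = [
    "win a free prize now",
    "free unlimited calls offer activate now",
    "claim your free cash offer",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch with the team tomorrow",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Turn the text into word counts, then fit the classifier on those counts
vectorizer = CountVectorizer()
X_train_vectorized = vectorizer.fit_transform(train_texts)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_vectorized, train_labels)

# Classify a new email using the vocabulary learned above
new_email = ["unlimited calls offer activate now"]
new_features = vectorizer.transform(new_email)
prediction = int(model.predict(new_features)[0])
# Column 1 of predict_proba is the probability of class 1 (spam)
spam_probability = float(model.predict_proba(new_features)[0, 1])
print("Is Spam:", prediction, "probability:", round(spam_probability, 3))
```

Unlike a bare 0/1 label, predict_proba exposes how confident the model is, which can help when choosing a stricter or looser spam threshold.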